[{"content":" Subtitle / Summary\nIf Subsets teaches the skeleton of combination-style backtracking, Permutations teaches the core of state-based backtracking: at each position, choose one unused element, continue until the path length reaches n, and only then collect the answer.\nReading time: 10-12 min Tags: Hot100, backtracking, permutations, DFS SEO keywords: Permutations, backtracking, used array, DFS, LeetCode 46 Meta description: Learn the stable permutation backtracking template for LeetCode 46, with state recovery, engineering analogies, and runnable multi-language solutions. Target Readers Hot100 learners who have already finished 78. Subsets and want the next backtracking template Developers who understand recursion but still make mistakes when restoring state Engineers who need to enumerate execution orders, test sequences, or ordering-sensitive plans Background / Motivation The key difference between combinations and permutations is simple:\ncombinations care about which elements are chosen permutations also care about the order of those elements So in this problem, [1,2,3] and [1,3,2] are different valid answers.\nThat immediately changes the template:\nstartIndex is no longer enough every layer must be able to consider all positions again we need explicit state to record which elements are already used That is exactly why LeetCode 46 is a foundational backtracking problem.\nIt forces you to reason clearly about state selection and state recovery.\nCore Concepts path: the permutation currently being built used[i]: whether nums[i] has already been placed in the current path Leaf-only collection: only when path.length == nums.length do we have a full permutation State recovery: on return, both path and used[i] must be restored A — Algorithm Problem Restatement Given an array nums of distinct integers, return all possible permutations.\nThe answer may be returned in any order.\nInput / Output Name Type Description nums int[] array of distinct integers 
return List[List[int]] all possible permutations Example 1 input: nums = [1,2,3] output: [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]] Example 2 input: nums = [0,1] output: [[0,1],[1,0]] Example 3 input: nums = [1] output: [[1]] Constraints 1 \u0026lt;= nums.length \u0026lt;= 6 -10 \u0026lt;= nums[i] \u0026lt;= 10 all integers in nums are distinct C — Concepts What changes when moving from Subsets to Permutations In 78. Subsets, the key boundary is startIndex because order does not matter.\nIn permutations, every layer asks a different question:\nWhich unused element should fill the next position?\nThat means:\nwe do not use startIndex every layer iterates over the whole array used[] decides whether an element is still available answers are collected only at leaf nodes Search tree model For nums = [1,2,3], the tree begins like this:\n[] |- [1] | |- [1,2] | | |- [1,2,3] | |- [1,3] | |- [1,3,2] |- [2] |- [3] Unlike subsets, the intermediate nodes are only prefixes.\nA node becomes a valid answer only when all positions have been filled.\nThe stable template dfs(): if path length == n: collect answer return for i in [0 .. 
n-1]: if used[i]: continue choose nums[i] used[i] = true dfs() used[i] = false undo nums[i] Practical Steps Prepare res, path, and a Boolean array used Enter DFS and first check whether the path is already full Iterate over all indices Skip elements already used in the current path Choose one element, recurse, then restore state on return Runnable Python example:\nfrom typing import List def permute(nums: List[int]) -\u0026gt; List[List[int]]: res: List[List[int]] = [] path: List[int] = [] used = [False] * len(nums) def dfs() -\u0026gt; None: if len(path) == len(nums): res.append(path.copy()) return for i, x in enumerate(nums): if used[i]: continue used[i] = True path.append(x) dfs() path.pop() used[i] = False dfs() return res if __name__ == \u0026#34;__main__\u0026#34;: print(permute([1, 2, 3])) print(permute([0, 1])) E — Engineering Scenario 1: task execution order enumeration (Python) Background: an offline scheduler wants to compare how different task orders affect the final result.\nWhy it fits: when order changes behavior, the search space is permutation-shaped.\ndef orders(tasks): if not tasks: return [[]] res = [] for i, task in enumerate(tasks): for rest in orders(tasks[:i] + tasks[i + 1:]): res.append([task] + rest) return res print(orders([\u0026#34;fetch\u0026#34;, \u0026#34;score\u0026#34;, \u0026#34;notify\u0026#34;])) Scenario 2: API regression order testing (Go) Background: the same set of API calls may trigger different cache or state paths when called in different orders.\nWhy it fits: validating order sensitivity is directly a permutation problem.\npackage main import \u0026#34;fmt\u0026#34; func permute(items []string) [][]string { if len(items) == 0 { return [][]string{{}} } res := make([][]string, 0) for i, item := range items { rest := append([]string{}, items[:i]...) rest = append(rest, items[i+1:]...) 
for _, tail := range permute(rest) { res = append(res, append([]string{item}, tail...)) } } return res } func main() { fmt.Println(permute([]string{\u0026#34;login\u0026#34;, \u0026#34;query\u0026#34;, \u0026#34;logout\u0026#34;})) } Scenario 3: animation order exploration (JavaScript) Background: during UI prototyping, a team wants to try several orders of animation steps.\nWhy it fits: different step orders produce different user experiences.\nfunction permute(items) { if (items.length === 0) return [[]]; const res = []; for (let i = 0; i \u0026lt; items.length; i += 1) { const rest = items.slice(0, i).concat(items.slice(i + 1)); for (const tail of permute(rest)) { res.push([items[i], ...tail]); } } return res; } console.log(permute([\u0026#34;fade\u0026#34;, \u0026#34;scale\u0026#34;, \u0026#34;slide\u0026#34;])); R — Reflection Complexity Time complexity: O(n * n!) Auxiliary recursion space: O(n) If output is counted, total space is dominated by the n! answer set Comparison with Subsets Problem Nature When to collect Key state 78. Subsets combinations every node startIndex 46. 
Permutations permutations leaf nodes only used[] Common mistakes forgetting to restore used[i] collecting answers at DFS entry and accidentally storing incomplete prefixes trying to solve permutations with startIndex and missing order variants Best Practices Think of this as “fill the next position” rather than “pick the next number” Restore path and used as a pair Draw a 3-level search tree before coding if the recursion feels abstract Keep the distinction between combination templates and permutation templates explicit in your notes S — Summary The real lesson of LeetCode 46 is state control through used[] Permutations collect answers only at leaf nodes because only leaves represent full results Compared with Subsets, this problem is less about boundaries and more about state recovery Once this template is stable, many order-sensitive search problems become much easier Suggested Next Problems 78. Subsets: combination-style backtracking template 17. Letter Combinations of a Phone Number: fixed-depth DFS 47. Permutations II: permutations with duplicate handling 51. 
N-Queens: state-heavy constrained search CTA After reading this, try rewriting the template once from memory and explain out loud why used[] is necessary.\nThat one habit will make the distinction between combinations and permutations much harder to forget.\nMulti-Language Implementations Python from typing import List def permute(nums: List[int]) -\u0026gt; List[List[int]]: res: List[List[int]] = [] path: List[int] = [] used = [False] * len(nums) def dfs() -\u0026gt; None: if len(path) == len(nums): res.append(path.copy()) return for i, x in enumerate(nums): if used[i]: continue used[i] = True path.append(x) dfs() path.pop() used[i] = False dfs() return res C #include \u0026lt;stdbool.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; typedef struct { int** data; int* col_sizes; int size; int capacity; } Result; static void push_result(Result* res, int* path, int n) { if (res-\u0026gt;size == res-\u0026gt;capacity) { res-\u0026gt;capacity *= 2; res-\u0026gt;data = realloc(res-\u0026gt;data, sizeof(int*) * res-\u0026gt;capacity); res-\u0026gt;col_sizes = realloc(res-\u0026gt;col_sizes, sizeof(int) * res-\u0026gt;capacity); } int* row = malloc(sizeof(int) * n); for (int i = 0; i \u0026lt; n; ++i) row[i] = path[i]; res-\u0026gt;data[res-\u0026gt;size] = row; res-\u0026gt;col_sizes[res-\u0026gt;size] = n; res-\u0026gt;size += 1; } static void dfs(int* nums, int n, bool* used, int* path, int depth, Result* res) { if (depth == n) { push_result(res, path, n); return; } for (int i = 0; i \u0026lt; n; ++i) { if (used[i]) continue; used[i] = true; path[depth] = nums[i]; dfs(nums, n, used, path, depth + 1, res); used[i] = false; } } int** permute(int* nums, int nums_size, int* return_size, int** return_column_sizes) { Result res = {0}; res.capacity = 16; res.data = malloc(sizeof(int*) * res.capacity); res.col_sizes = malloc(sizeof(int) * res.capacity); bool* used = calloc(nums_size, sizeof(bool)); int* path = malloc(sizeof(int) * nums_size); dfs(nums, nums_size, used, path, 0, 
\u0026amp;res); free(used); free(path); *return_size = res.size; *return_column_sizes = res.col_sizes; return res.data; } C++ #include \u0026lt;vector\u0026gt; using namespace std; class Solution { public: vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt; permute(vector\u0026lt;int\u0026gt;\u0026amp; nums) { vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt; res; vector\u0026lt;int\u0026gt; path; vector\u0026lt;int\u0026gt; used(nums.size(), 0); dfs(nums, used, path, res); return res; } private: void dfs(const vector\u0026lt;int\u0026gt;\u0026amp; nums, vector\u0026lt;int\u0026gt;\u0026amp; used, vector\u0026lt;int\u0026gt;\u0026amp; path, vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt;\u0026amp; res) { if ((int)path.size() == (int)nums.size()) { res.push_back(path); return; } for (int i = 0; i \u0026lt; (int)nums.size(); ++i) { if (used[i]) continue; used[i] = 1; path.push_back(nums[i]); dfs(nums, used, path, res); path.pop_back(); used[i] = 0; } } }; Go package main func permute(nums []int) [][]int { res := make([][]int, 0) path := make([]int, 0, len(nums)) used := make([]bool, len(nums)) var dfs func() dfs = func() { if len(path) == len(nums) { snapshot := append([]int(nil), path...) 
res = append(res, snapshot) return } for i, x := range nums { if used[i] { continue } used[i] = true path = append(path, x) dfs() path = path[:len(path)-1] used[i] = false } } dfs() return res } Rust fn permute(nums: Vec\u0026lt;i32\u0026gt;) -\u0026gt; Vec\u0026lt;Vec\u0026lt;i32\u0026gt;\u0026gt; { fn dfs(nums: \u0026amp;[i32], used: \u0026amp;mut [bool], path: \u0026amp;mut Vec\u0026lt;i32\u0026gt;, res: \u0026amp;mut Vec\u0026lt;Vec\u0026lt;i32\u0026gt;\u0026gt;) { if path.len() == nums.len() { res.push(path.clone()); return; } for i in 0..nums.len() { if used[i] { continue; } used[i] = true; path.push(nums[i]); dfs(nums, used, path, res); path.pop(); used[i] = false; } } let mut res = Vec::new(); let mut path = Vec::new(); let mut used = vec![false; nums.len()]; dfs(\u0026amp;nums, \u0026amp;mut used, \u0026amp;mut path, \u0026amp;mut res); res } JavaScript function permute(nums) { const res = []; const path = []; const used = new Array(nums.length).fill(false); function dfs() { if (path.length === nums.length) { res.push([...path]); return; } for (let i = 0; i \u0026lt; nums.length; i += 1) { if (used[i]) continue; used[i] = true; path.push(nums[i]); dfs(); path.pop(); used[i] = false; } } dfs(); return res; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/backtracking/46-permutations/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nIf Subsets teaches the skeleton of combination-style backtracking, Permutations teaches the core of state-based backtracking: at each position, choose one unused element, continue until the path length reaches \u003ccode\u003en\u003c/code\u003e, and only then collect the answer.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, 
\u003ccode\u003ebacktracking\u003c/code\u003e, \u003ccode\u003epermutations\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Permutations, backtracking, used array, DFS, LeetCode 46\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Learn the stable permutation backtracking template for LeetCode 46, with state recovery, engineering analogies, and runnable multi-language solutions.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who have already finished \u003ccode\u003e78. Subsets\u003c/code\u003e and want the next backtracking template\u003c/li\u003e\n\u003cli\u003eDevelopers who understand recursion but still make mistakes when restoring state\u003c/li\u003e\n\u003cli\u003eEngineers who need to enumerate execution orders, test sequences, or ordering-sensitive plans\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThe key difference between combinations and permutations is simple:\u003c/p\u003e","title":"Hot100: Permutations (used[] Backtracking ACERS Guide)"},{"content":" Subtitle / Summary\nSubsets is the cleanest entry point into Hot100 backtracking. The main thing to stabilize is not “enumerate everything”, but the three invariants behind the template: path, startIndex, and “every node is already a valid answer”.\nReading time: 10-12 min Tags: Hot100, backtracking, subsets, DFS SEO keywords: Subsets, backtracking, startIndex, power set, LeetCode 78 Meta description: Learn the stable backtracking template for LeetCode 78, with engineering analogies, pitfalls, and runnable multi-language solutions. 
Target Readers Hot100 learners starting the backtracking block today Developers who can write DFS but still mix up combinations and permutations Engineers who need to enumerate feature sets, candidate policies, or configuration bundles Background / Motivation Many “real” problems reduce to a subset model:\nwhich feature flags should be enabled together which rules should be combined into one experiment which filters should be included in a saved preset What makes LeetCode 78 valuable is that the problem is deliberately simple:\nall numbers are distinct there is no target sum there is no duplicate-removal complication That simplicity lets you focus on the template itself before adding pruning, fixed lengths, or duplicate handling.\nCore Concepts path: the current chosen elements on the recursion path startIndex: the first candidate index allowed in the current layer Preorder collection: in the subsets problem, every node in the search tree is already one valid answer Backtrack undo: after recursion returns, remove the last chosen element A — Algorithm Problem Restatement Given an integer array nums whose elements are all distinct, return all possible subsets (the power set).\nThe solution set must not contain duplicate subsets, and the answer order does not matter.\nInput / Output Name Type Description nums int[] array of distinct integers return List[List[int]] all possible subsets Example 1 input: nums = [1,2,3] output: [[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3]] Example 2 input: nums = [0] output: [[],[0]] Constraints 1 \u0026lt;= nums.length \u0026lt;= 10 -10 \u0026lt;= nums[i] \u0026lt;= 10 all elements of nums are distinct C — Concepts Why this is the right first backtracking problem This problem removes most secondary difficulty and leaves only the skeleton:\nstore the current choice in path decide where the next layer may start restore state after recursion That is exactly why it should come before problems like permutations or combination sum.\nThe search 
tree For nums = [1,2,3], the tree looks like this:\n[] |- [1] | |- [1,2] | | |- [1,2,3] | |- [1,3] |- [2] | |- [2,3] |- [3] Every node is a subset, so every node must be collected.\nThe stable template dfs(start): collect current path for i in [start .. n-1]: choose nums[i] dfs(i + 1) undo nums[i] i + 1 is the critical boundary.\nIt means later layers only look to the right, so [1,2] is generated once and [2,1] is never produced as a separate state.\nPractical Steps Create res and path Define dfs(startIndex) As soon as dfs begins, push a snapshot of path into the answer Iterate from startIndex to the end Choose one number, recurse with i + 1, then undo the choice Runnable Python example:\nfrom typing import List def subsets(nums: List[int]) -\u0026gt; List[List[int]]: res: List[List[int]] = [] path: List[int] = [] def dfs(start: int) -\u0026gt; None: res.append(path.copy()) for i in range(start, len(nums)): path.append(nums[i]) dfs(i + 1) path.pop() dfs(0) return res if __name__ == \u0026#34;__main__\u0026#34;: print(subsets([1, 2, 3])) print(subsets([0])) E — Engineering Scenario 1: feature-flag bundle generation (Python) Background: enumerate all possible flag bundles for offline experiment planning.\nWhy it fits: each flag is either included or not included, which is exactly a subset model.\ndef all_flag_sets(flags): res = [[]] for flag in flags: res += [old + [flag] for old in res] return res print(all_flag_sets([\u0026#34;new-ui\u0026#34;, \u0026#34;cache-v2\u0026#34;, \u0026#34;risk-guard\u0026#34;])) Scenario 2: policy-module candidate sets (Go) Background: a backend risk system wants to test all combinations of several rule modules.\nWhy it fits: “pick any subset of modules” is the same combinatorial space.\npackage main import \u0026#34;fmt\u0026#34; func subsets(items []string) [][]string { res := [][]string{{}} for _, item := range items { size := len(res) for i := 0; i \u0026lt; size; i++ { next := append([]string{}, res[i]...) 
next = append(next, item) res = append(res, next) } } return res } func main() { fmt.Println(subsets([]string{\u0026#34;ruleA\u0026#34;, \u0026#34;ruleB\u0026#34;, \u0026#34;ruleC\u0026#34;})) } Scenario 3: saved filter preset generation (JavaScript) Background: a frontend app wants to precompute filter presets for demos or regression coverage.\nWhy it fits: each filter can be enabled or disabled, so the full set of presets is a power set.\nfunction subsets(items) { const res = [[]]; for (const item of items) { const size = res.length; for (let i = 0; i \u0026lt; size; i += 1) { res.push([...res[i], item]); } } return res; } console.log(subsets([\u0026#34;tag\u0026#34;, \u0026#34;price\u0026#34;, \u0026#34;stock\u0026#34;])); R — Reflection Complexity Time complexity: O(n * 2^n) Auxiliary recursion space: O(n) Output size: O(n * 2^n) in total Alternatives Method Idea Strength Weakness Backtracking grow a path layer by layer best template for later problems requires a clear tree model Bitmask one bit means choose / skip short and compact less intuitive for future pruning problems Iterative expansion extend existing subsets by one new item elegant for this one problem less reusable when constraints become complex Common mistakes collecting only at leaf nodes and missing most subsets appending path directly instead of a copy restarting each layer from index 0 and accidentally generating permutation-like duplicates Best Practices Treat this as the base template for “combination-style” backtracking Always make a snapshot when storing path Ask yourself four questions while coding: What does path mean? Why collect here? Where does this layer start? What exactly is undone on return? 
S — Summary LeetCode 78 is the cleanest problem for building the backtracking skeleton startIndex is what makes this combinations/subsets logic, not permutations logic In subsets, every node is a valid answer, so collection happens before deeper recursion Once this template is stable, problems like permutations, combination sum, and subsets with duplicates become much easier Suggested Next Problems 46. Permutations: add used[] and learn permutation-style backtracking 17. Letter Combinations of a Phone Number: fixed-depth DFS 39. Combination Sum: add pruning and repeated use of the same candidate 90. Subsets II: handle duplicates cleanly CTA If this is your first backtracking problem today, write it once from memory after reading.\nThat is the fastest way to make the template stick.\nMulti-Language Implementations Python from typing import List def subsets(nums: List[int]) -\u0026gt; List[List[int]]: res: List[List[int]] = [] path: List[int] = [] def dfs(start: int) -\u0026gt; None: res.append(path.copy()) for i in range(start, len(nums)): path.append(nums[i]) dfs(i + 1) path.pop() dfs(0) return res C #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; typedef struct { int** data; int* col_sizes; int size; int capacity; } Result; static void push_result(Result* res, int* path, int path_size) { if (res-\u0026gt;size == res-\u0026gt;capacity) { res-\u0026gt;capacity *= 2; res-\u0026gt;data = realloc(res-\u0026gt;data, sizeof(int*) * res-\u0026gt;capacity); res-\u0026gt;col_sizes = realloc(res-\u0026gt;col_sizes, sizeof(int) * res-\u0026gt;capacity); } int* row = malloc(sizeof(int) * path_size); for (int i = 0; i \u0026lt; path_size; ++i) row[i] = path[i]; res-\u0026gt;data[res-\u0026gt;size] = row; res-\u0026gt;col_sizes[res-\u0026gt;size] = path_size; res-\u0026gt;size += 1; } static void dfs(int* nums, int nums_size, int start, int* path, int path_size, Result* res) { push_result(res, path, path_size); for (int i = start; i \u0026lt; nums_size; 
++i) { path[path_size] = nums[i]; dfs(nums, nums_size, i + 1, path, path_size + 1, res); } } int** subsets(int* nums, int nums_size, int* return_size, int** return_column_sizes) { Result res = {0}; res.capacity = 16; res.data = malloc(sizeof(int*) * res.capacity); res.col_sizes = malloc(sizeof(int) * res.capacity); int* path = malloc(sizeof(int) * nums_size); dfs(nums, nums_size, 0, path, 0, \u0026amp;res); free(path); *return_size = res.size; *return_column_sizes = res.col_sizes; return res.data; } C++ #include \u0026lt;vector\u0026gt; using namespace std; class Solution { public: vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt; subsets(vector\u0026lt;int\u0026gt;\u0026amp; nums) { vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt; res; vector\u0026lt;int\u0026gt; path; dfs(nums, 0, path, res); return res; } private: void dfs(const vector\u0026lt;int\u0026gt;\u0026amp; nums, int start, vector\u0026lt;int\u0026gt;\u0026amp; path, vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt;\u0026amp; res) { res.push_back(path); for (int i = start; i \u0026lt; (int)nums.size(); ++i) { path.push_back(nums[i]); dfs(nums, i + 1, path, res); path.pop_back(); } } }; Go package main func subsets(nums []int) [][]int { res := make([][]int, 0) path := make([]int, 0) var dfs func(int) dfs = func(start int) { snapshot := append([]int(nil), path...) 
res = append(res, snapshot) for i := start; i \u0026lt; len(nums); i++ { path = append(path, nums[i]) dfs(i + 1) path = path[:len(path)-1] } } dfs(0) return res } Rust fn subsets(nums: Vec\u0026lt;i32\u0026gt;) -\u0026gt; Vec\u0026lt;Vec\u0026lt;i32\u0026gt;\u0026gt; { fn dfs(nums: \u0026amp;[i32], start: usize, path: \u0026amp;mut Vec\u0026lt;i32\u0026gt;, res: \u0026amp;mut Vec\u0026lt;Vec\u0026lt;i32\u0026gt;\u0026gt;) { res.push(path.clone()); for i in start..nums.len() { path.push(nums[i]); dfs(nums, i + 1, path, res); path.pop(); } } let mut res = Vec::new(); let mut path = Vec::new(); dfs(\u0026amp;nums, 0, \u0026amp;mut path, \u0026amp;mut res); res } JavaScript function subsets(nums) { const res = []; const path = []; function dfs(start) { res.push([...path]); for (let i = start; i \u0026lt; nums.length; i += 1) { path.push(nums[i]); dfs(i + 1); path.pop(); } } dfs(0); return res; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/backtracking/78-subsets/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nSubsets is the cleanest entry point into Hot100 backtracking. 
The main thing to stabilize is not “enumerate everything”, but the three invariants behind the template: \u003ccode\u003epath\u003c/code\u003e, \u003ccode\u003estartIndex\u003c/code\u003e, and “every node is already a valid answer”.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003ebacktracking\u003c/code\u003e, \u003ccode\u003esubsets\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Subsets, backtracking, startIndex, power set, LeetCode 78\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Learn the stable backtracking template for LeetCode 78, with engineering analogies, pitfalls, and runnable multi-language solutions.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners starting the backtracking block today\u003c/li\u003e\n\u003cli\u003eDevelopers who can write DFS but still mix up combinations and permutations\u003c/li\u003e\n\u003cli\u003eEngineers who need to enumerate feature sets, candidate policies, or configuration bundles\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eMany “real” problems reduce to a subset model:\u003c/p\u003e","title":"Hot100: Subsets (Backtracking / startIndex ACERS Guide)"},{"content":" Subtitle / Summary\nClone Graph is not a traversal-only problem. The real challenge is preserving graph structure while avoiding duplicate copies in the presence of cycles. 
The stable solution is a traversal plus a hash map from original nodes to cloned nodes.\nReading time: 12-15 min Tags: graph, dfs, bfs, hash map, deep copy SEO keywords: Clone Graph, graph deep copy, DFS, BFS, LeetCode 133 Meta description: Deep-copy an undirected graph with a node-to-node map, explaining why memoization is mandatory and how DFS/BFS versions work, with runnable code in six languages. Target Readers LeetCode learners practicing graph traversal and deep-copy patterns Engineers who duplicate object graphs, workflow graphs, or topology graphs Developers who want one reusable template for “clone with cycles” Background / Motivation Many “copy” problems are actually identity-preservation problems.\nFor arrays or flat objects, copying is straightforward.\nGraphs are different because:\na node can be reached from multiple paths the graph can contain cycles copying only values is not enough; edges must point to cloned neighbors, not original ones This is why Clone Graph is a classic interview and engineering problem:\ncopying workflow DAGs or small cyclic state machines duplicating editor nodes while preserving connections snapshotting a topology-like data structure before mutation Core Concepts Deep copy: every node in the returned graph is a newly created node Node identity: the key is the original node object/reference, not only val Adjacency structure: each cloned node must point to cloned neighbors in the same shape as the original graph Memo map: original_node -\u0026gt; cloned_node, used to avoid repeated cloning and infinite recursion A - Algorithm Problem Restatement You are given a reference to one node in a connected undirected graph.\nReturn a deep copy of the entire graph.\nEach node contains:\nclass Node { public int val; public List\u0026lt;Node\u0026gt; neighbors; } The graph in the test case is represented as an adjacency list.\nThe given node is always the node with value 1, unless the graph is empty.\nInput / Output Name Type Meaning node 
Node or null one node in the original graph return Node or null one node in the cloned graph Examples Example 1 Input: adjList = [[2,4],[1,3],[2,4],[1,3]] Output: [[2,4],[1,3],[2,4],[1,3]] Explanation:\nNode 1 connects to 2 and 4 Node 2 connects to 1 and 3 Node 3 connects to 2 and 4 Node 4 connects to 1 and 3 The cloned graph must have the same neighbor relationships, but all nodes must be newly allocated.\nExample 2 Input: adjList = [[]] Output: [[]] There is exactly one node and it has no neighbors.\nExample 3 Input: adjList = [] Output: [] The graph is empty, so the answer is null.\nConstraints The number of nodes is in the range [0, 100] 1 \u0026lt;= Node.val \u0026lt;= 100 Node.val is unique for each node There are no repeated edges and no self-loops The graph is connected and all nodes are reachable from the given node Thought Process: From Wrong Copying to the Correct Pattern Wrong idea 1: clone one node at a time without memory Suppose we do this:\ncreate a clone of the current node recursively clone every neighbor This breaks when the graph has a cycle.\nExample:\n1 -- 2 | | 4 -- 3 If you clone 1, then 2, then 1 again through the back edge, you create duplicate nodes or recurse forever.\nWrong idea 2: use only node values as a complete substitute for node objects In this LeetCode problem, values are unique, so value-based mapping happens to work.\nBut the transferable engineering pattern is:\nmap by original node identity/reference, not by coincidence of values.\nThat keeps the solution correct even when value uniqueness is not guaranteed in other systems.\nKey observation Each original node should be cloned exactly once.\nAfter that, every edge should reuse the already-created clone.\nThat leads directly to:\ntraversal: DFS or BFS memo map: original -\u0026gt; cloned C - Concepts Method Category Graph traversal Hash table / memoization Deep-copy construction Why the Memo Map Is Mandatory The memo map solves two problems at once:\nPrevents infinite loops 
on cyclic graphs Prevents duplicate clones when multiple paths reach the same node Without the map, the copied graph cannot preserve shared structure correctly.\nDFS Version The DFS idea is:\nif the input node is null, return null if the node has already been cloned, return the stored clone otherwise create a new clone and store it immediately recursively clone every neighbor and append the cloned neighbors return the cloned node Storing the clone before recursing is essential.\nThat is what breaks cycles safely.\nBFS Version The BFS version is equally valid:\nclone the starting node push the original node into a queue pop nodes level by level for each neighbor: create its clone if missing append the neighbor clone to the current clone enqueue the original neighbor if first seen DFS is usually shorter to write.\nBFS can feel more explicit if you prefer iterative traversal.\nCorrectness Intuition Once a node is first seen:\none clone is created the mapping remembers that clone forever So every future edge pointing to the original node can safely point to the same cloned node.\nThat ensures both:\nnode uniqueness in the clone edge structure preservation Reference Implementation Before moving to engineering scenarios, it helps to pin down the direct interview solution first.\nPython DFS from collections import deque class Node: def __init__(self, val=0, neighbors=None): self.val = val self.neighbors = neighbors if neighbors is not None else [] def build_graph(adj_list): if not adj_list: return None nodes = {i + 1: Node(i + 1) for i in range(len(adj_list))} for i, neighbors in enumerate(adj_list, start=1): nodes[i].neighbors = [nodes[val] for val in neighbors] return nodes[1] def graph_to_adj_list(node): if node is None: return [] seen = {node} queue = deque([node]) nodes_by_val = {} while queue: cur = queue.popleft() nodes_by_val[cur.val] = cur for nxt in cur.neighbors: if nxt not in seen: seen.add(nxt) queue.append(nxt) return [ [nxt.val for nxt in 
nodes_by_val[val].neighbors] for val in sorted(nodes_by_val) ] def clone_graph_dfs(node): copies = {} def dfs(cur): if cur is None: return None if cur in copies: return copies[cur] cloned = Node(cur.val) copies[cur] = cloned for nxt in cur.neighbors: cloned.neighbors.append(dfs(nxt)) return cloned return dfs(node) if __name__ == \u0026#34;__main__\u0026#34;: adj_list = [[2, 4], [1, 3], [2, 4], [1, 3]] original = build_graph(adj_list) cloned = clone_graph_dfs(original) print(graph_to_adj_list(cloned)) print(original is cloned) Python BFS from collections import deque class Node: def __init__(self, val=0, neighbors=None): self.val = val self.neighbors = neighbors if neighbors is not None else [] def build_graph(adj_list): if not adj_list: return None nodes = {i + 1: Node(i + 1) for i in range(len(adj_list))} for i, neighbors in enumerate(adj_list, start=1): nodes[i].neighbors = [nodes[val] for val in neighbors] return nodes[1] def graph_to_adj_list(node): if node is None: return [] seen = {node} queue = deque([node]) nodes_by_val = {} while queue: cur = queue.popleft() nodes_by_val[cur.val] = cur for nxt in cur.neighbors: if nxt not in seen: seen.add(nxt) queue.append(nxt) return [ [nxt.val for nxt in nodes_by_val[val].neighbors] for val in sorted(nodes_by_val) ] def clone_graph_bfs(node): if node is None: return None copies = {node: Node(node.val)} queue = deque([node]) while queue: cur = queue.popleft() for nxt in cur.neighbors: if nxt not in copies: copies[nxt] = Node(nxt.val) queue.append(nxt) copies[cur].neighbors.append(copies[nxt]) return copies[node] if __name__ == \u0026#34;__main__\u0026#34;: adj_list = [[2, 4], [1, 3], [2, 4], [1, 3]] original = build_graph(adj_list) cloned = clone_graph_bfs(original) print(graph_to_adj_list(cloned)) print(original is cloned) E - Engineering Scenario 1: Duplicating a Workflow Graph Template (Python) Background: a workflow editor stores nodes and outgoing links.\nWhy it fits: every duplicated workflow must preserve 
connections without sharing mutable nodes with the original.\ndef clone_adj(graph): copied = {} def dfs(u): if u in copied: return copied[u] copied[u] = [] for v in graph.get(u, []): dfs(v) copied[u].append(v) return copied[u] for u in graph: dfs(u) return copied workflow = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]} print(clone_adj(workflow)) Scenario 2: Cloning a Service Dependency Snapshot (Go) Background: before mutating a service dependency graph, you want a safe snapshot.\nWhy it fits: the graph may contain cycles, shared dependencies, and repeated reachability paths.\npackage main import \u0026#34;fmt\u0026#34; func cloneAdj(graph map[int][]int) map[int][]int { out := map[int][]int{} for u, ns := range graph { cp := make([]int, len(ns)) copy(cp, ns) out[u] = cp } return out } func main() { g := map[int][]int{1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}} fmt.Println(cloneAdj(g)) } Scenario 3: Copy-Paste in a Frontend Node Editor (JavaScript) Background: a visual editor copies a graph of blocks and edges.\nWhy it fits: pasted blocks must point only to pasted blocks, never to the original graph.\nfunction cloneAdj(graph) { const out = {}; for (const [k, v] of Object.entries(graph)) { out[k] = [...v]; } return out; } const graph = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}; console.log(cloneAdj(graph)); R - Reflection Complexity Let:\nn = number of nodes m = number of edges Then the DFS/BFS clone visits each node once and each edge once:\nTime: O(n + m) Space: O(n) The extra space comes from:\nthe memo map the recursion stack for DFS or the queue for BFS Alternatives DFS + hash map: shortest and most common BFS + hash map: equally correct, iterative Naive recursive copy without map: incorrect on cyclic graphs Common Mistakes Creating the clone after processing neighbors, which breaks cycles Forgetting to memoize the node before recursion Copying neighbor values instead of neighbor node references Returning a shallow copy where neighbor lists still point to 
original nodes Why This Solution Is the Most Practical This problem is fundamentally “graph traversal + identity-preserving duplication.”\nA memo map solves exactly the hard part, so DFS/BFS with mapping is both the cleanest interview solution and the most transferable engineering pattern.\nFAQ Why not map by val?\nIn this problem val is unique, so it works. But the more general and safer pattern is mapping original node references to cloned node references.\nWhy store the clone before traversing neighbors?\nBecause a cycle may revisit the same node immediately. The map entry must already exist when that happens.\nDFS or BFS, which one is better?\nNeither is asymptotically better here. DFS is shorter; BFS avoids recursion depth concerns.\nS - Summary Clone Graph is a deep-copy problem, not just a traversal problem. The essential data structure is a map from original nodes to cloned nodes. Memoizing before recursing is what makes cycles safe. DFS and BFS are both valid; the important invariant is one clone per original node. 
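To make the one-clone-per-original-node invariant concrete, here is a minimal self-contained check. The `Node` class and `clone_graph_dfs` mirror the reference implementation above, and the graph is the same 1-2-3-4 cycle used throughout this article; the traversal at the end is just an illustrative sanity check, not part of the interview solution.

```python
class Node:
    def __init__(self, val=0, neighbors=None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

def clone_graph_dfs(node):
    copies = {}
    def dfs(cur):
        if cur is None:
            return None
        if cur in copies:
            return copies[cur]
        cloned = Node(cur.val)
        copies[cur] = cloned  # memoize BEFORE recursing: this is what breaks cycles
        for nxt in cur.neighbors:
            cloned.neighbors.append(dfs(nxt))
        return cloned
    return dfs(node)

# Build the 4-node cycle 1-2, 2-3, 3-4, 4-1 from the article.
a, b, c, d = Node(1), Node(2), Node(3), Node(4)
a.neighbors = [b, d]
b.neighbors = [a, c]
c.neighbors = [b, d]
d.neighbors = [a, c]

clone = clone_graph_dfs(a)

# Invariant 1: no node reachable from the clone is an original node.
originals = {a, b, c, d}
seen, stack = set(), [clone]
while stack:
    cur = stack.pop()
    if cur in seen:
        continue
    seen.add(cur)
    assert cur not in originals, "clone must not share nodes with the original"
    stack.extend(cur.neighbors)

# Invariant 2: exactly one clone exists per original node.
assert len(seen) == 4
```

If you comment out the `copies[cur] = cloned` line and move it after the neighbor loop, the recursion on the back edge never finds a memo entry and overflows the stack, which is a quick way to convince yourself why memoizing before recursing matters.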
Further Reading LeetCode 138: Copy List with Random Pointer Graph traversal templates for DFS and BFS Deep-copy patterns for cyclic object graphs Next Step Try rewriting the same solution in both DFS and BFS styles.\nIf you can switch between the two without changing the memo-map invariant, you fully understand the problem.\nMulti-language Implementations Python from typing import Optional class Node: def __init__(self, val: int = 0, neighbors=None): self.val = val self.neighbors = neighbors if neighbors is not None else [] class Solution: def cloneGraph(self, node: Optional[\u0026#34;Node\u0026#34;]) -\u0026gt; Optional[\u0026#34;Node\u0026#34;]: copies = {} def dfs(cur: Optional[\u0026#34;Node\u0026#34;]) -\u0026gt; Optional[\u0026#34;Node\u0026#34;]: if cur is None: return None if cur in copies: return copies[cur] cloned = Node(cur.val) copies[cur] = cloned for nxt in cur.neighbors: cloned.neighbors.append(dfs(nxt)) return cloned return dfs(node) C /* * LeetCode provides the Node definition. * The core idea is: * 1. keep a map original -\u0026gt; cloned * 2. create the clone before recursing * * In pure C, the hash table implementation is verbose, so interview answers * often use C++/Go/Python for this problem. The algorithm itself is the same. */ C++ /* // Definition for a Node. 
class Node { public: int val; vector\u0026lt;Node*\u0026gt; neighbors; Node() { val = 0; neighbors = vector\u0026lt;Node*\u0026gt;(); } Node(int _val) { val = _val; neighbors = vector\u0026lt;Node*\u0026gt;(); } Node(int _val, vector\u0026lt;Node*\u0026gt; _neighbors) { val = _val; neighbors = _neighbors; } }; */ class Solution { public: unordered_map\u0026lt;Node*, Node*\u0026gt; copies; Node* cloneGraph(Node* node) { return dfs(node); } Node* dfs(Node* node) { if (!node) return nullptr; if (copies.count(node)) return copies[node]; Node* cloned = new Node(node-\u0026gt;val); copies[node] = cloned; for (Node* nxt : node-\u0026gt;neighbors) { cloned-\u0026gt;neighbors.push_back(dfs(nxt)); } return cloned; } }; Go /** * type Node struct { * Val int * Neighbors []*Node * } */ func cloneGraph(node *Node) *Node { copies := map[*Node]*Node{} var dfs func(*Node) *Node dfs = func(cur *Node) *Node { if cur == nil { return nil } if cp, ok := copies[cur]; ok { return cp } cloned := \u0026amp;Node{Val: cur.Val, Neighbors: []*Node{}} copies[cur] = cloned for _, nxt := range cur.Neighbors { cloned.Neighbors = append(cloned.Neighbors, dfs(nxt)) } return cloned } return dfs(node) } Rust use std::cell::RefCell; use std::collections::HashMap; use std::rc::Rc; type NodeRef = Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;; #[derive(Debug)] pub struct Node { pub val: i32, pub neighbors: Vec\u0026lt;NodeRef\u0026gt;, } fn clone_graph(node: Option\u0026lt;NodeRef\u0026gt;) -\u0026gt; Option\u0026lt;NodeRef\u0026gt; { fn dfs(cur: \u0026amp;NodeRef, copies: \u0026amp;mut HashMap\u0026lt;*const RefCell\u0026lt;Node\u0026gt;, NodeRef\u0026gt;) -\u0026gt; NodeRef { let key = Rc::as_ptr(cur); if let Some(existing) = copies.get(\u0026amp;key) { return existing.clone(); } let cloned = Rc::new(RefCell::new(Node { val: cur.borrow().val, neighbors: vec![] })); copies.insert(key, cloned.clone()); let neighbors = cur.borrow().neighbors.clone(); for nxt in neighbors { let cp = dfs(\u0026amp;nxt, 
copies); cloned.borrow_mut().neighbors.push(cp); } cloned } let mut copies = HashMap::new(); node.map(|n| dfs(\u0026amp;n, \u0026amp;mut copies)) } JavaScript /* // Definition for a Node. function Node(val, neighbors) { this.val = val === undefined ? 0 : val; this.neighbors = neighbors === undefined ? [] : neighbors; } */ var cloneGraph = function (node) { const copies = new Map(); function dfs(cur) { if (cur === null) return null; if (copies.has(cur)) return copies.get(cur); const cloned = new Node(cur.val); copies.set(cur, cloned); for (const nxt of cur.neighbors) { cloned.neighbors.push(dfs(nxt)); } return cloned; } return dfs(node); }; ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/133-clone-graph/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nClone Graph is not a traversal-only problem. The real challenge is preserving graph structure while avoiding duplicate copies in the presence of cycles. 
The stable solution is a traversal plus a hash map from original nodes to cloned nodes.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003egraph\u003c/code\u003e, \u003ccode\u003edfs\u003c/code\u003e, \u003ccode\u003ebfs\u003c/code\u003e, \u003ccode\u003ehash map\u003c/code\u003e, \u003ccode\u003edeep copy\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Clone Graph, graph deep copy, DFS, BFS, LeetCode 133\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Deep-copy an undirected graph with a node-to-node map, explaining why memoization is mandatory and how DFS/BFS versions work, with runnable code in six languages.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLeetCode learners practicing graph traversal and deep-copy patterns\u003c/li\u003e\n\u003cli\u003eEngineers who duplicate object graphs, workflow graphs, or topology graphs\u003c/li\u003e\n\u003cli\u003eDevelopers who want one reusable template for “clone with cycles”\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eMany “copy” problems are actually identity-preservation problems.\u003c/p\u003e","title":"LeetCode 133: Clone Graph Hash Map + DFS/BFS ACERS Guide"},{"content":" Subtitle / Summary\nThis problem is a useful bridge between sorting and binary search. 
After sorting the array, all copies of target become one contiguous block, and the answer is simply every index inside that block.\nReading time: 10-12 min Tags: sorting, binary search, range location SEO keywords: Find Target Indices After Sorting Array, LeetCode 2089, lower bound, upper bound Meta description: Sort the array, use lower and upper bounds to find the target block, and return every matching index, with tradeoffs, engineering scenarios, and runnable implementations in six languages. Target Readers Learners connecting sorting with lower/upper bound search Engineers who need all positions of one value after offline sorting Interview candidates reviewing how contiguous blocks form in sorted data Background / Motivation The input array is not sorted, so we cannot apply binary search immediately.\nBut once we sort it, every copy of the same value becomes one continuous segment.\nThat gives a clean workflow:\nsort the data find where the target block starts find where the target block ends output all indices in that block This is the same “range of equal values” idea used in many sorted-data systems:\nleaderboard grouping equal-score buckets value-based offline analytics Core Concepts Sorted target block: equal values are contiguous after sorting Lower bound: first index i such that nums[i] \u0026gt;= target Upper bound: first index i such that nums[i] \u0026gt; target Answer range: all indices in [lower_bound(target), upper_bound(target)) A - Algorithm Problem Restatement Given an integer array nums and an integer target:\nsort nums in non-decreasing order return all indices where the sorted array equals target If target does not exist, return an empty array.\nInput / Output Name Type Meaning nums int[] unsorted integer array target int value to locate after sorting return int[] all indices of target in the sorted array Example 1 nums = [1, 2, 5, 2, 3] target = 2 sorted = [1, 2, 2, 3, 5] output = [1, 2] Example 2 nums = [1, 2, 5, 2, 3] target = 3 sorted = 
[1, 2, 2, 3, 5] output = [3] Example 3 nums = [1, 2, 5, 2, 3] target = 5 sorted = [1, 2, 2, 3, 5] output = [4] Thought Process: From Sort-and-Scan to Sort-and-Bounds The most direct solution is:\nsort the array scan the sorted result collect every index whose value equals target That is valid and easy to understand.\nIf you want to train the binary-search pattern, there is a cleaner post-sort observation:\nafter sorting, all targets are adjacent so the answer is one contiguous interval That means we can:\nfind the first target with lower bound find the first value greater than target with upper bound generate the index list from that interval C - Concepts Method Category Sorting + binary search Range discovery in sorted data Boundary search Why the Target Indices Form a Continuous Range After sorting:\nall values smaller than target appear first then all copies of target then all values greater than target So if:\nl = lower_bound(target) r = upper_bound(target) then the target indices are exactly:\n[l, l+1, ..., r-1] Stable Algorithm sort nums compute l = lower_bound(nums, target) compute r = upper_bound(nums, target) if l == r, return [] otherwise return all integers from l to r - 1 E - Engineering Scenario 1: Equal-Score Buckets After Offline Sort (Python) Background: scores are collected unsorted, then sorted for reporting.\nWhy it fits: equal scores become one contiguous block.\nfrom bisect import bisect_left, bisect_right scores = sorted([1, 2, 5, 2, 3]) l = bisect_left(scores, 2) r = bisect_right(scores, 2) print(list(range(l, r))) Scenario 2: Batch Value Grouping in Services (Go) Background: a batch job sorts values before producing grouped summaries.\nWhy it fits: the exact index block of one value is found by two boundaries.\npackage main import ( \u0026#34;fmt\u0026#34; \u0026#34;sort\u0026#34; ) func main() { nums := []int{1, 2, 5, 2, 3} sort.Ints(nums) l := sort.Search(len(nums), func(i int) bool { return nums[i] \u0026gt;= 2 }) r := 
sort.Search(len(nums), func(i int) bool { return nums[i] \u0026gt; 2 }) var ans []int for i := l; i \u0026lt; r; i++ { ans = append(ans, i) } fmt.Println(ans) } Scenario 3: Frontend Highlighting of Equal Rankings (JavaScript) Background: a UI sorts scores and highlights every position tied with one value.\nWhy it fits: ties become one adjacent segment after sorting.\nconst nums = [1, 2, 5, 2, 3].slice().sort((a, b) =\u0026gt; a - b); const ans = []; for (let i = 0; i \u0026lt; nums.length; i++) { if (nums[i] === 2) ans.push(i); } console.log(nums, ans); // [1,2,2,3,5] [1,2] R - Reflection Complexity For the sort + bounds solution:\nTime: O(n log n) because sorting dominates Space: O(1) extra if the language sort is in-place and we ignore implementation details otherwise depends on the sorting implementation Alternative: Direct Counting There is an important alternative:\ncount how many elements are \u0026lt; target count how many elements are == target build the answer range directly That approach is O(n) and is asymptotically better for this standalone problem.\nWhy Keep the Sort + Bounds Version in a Binary-Search Series Even though counting can be faster here, sort + lower/upper bound is still valuable because it teaches a reusable pattern:\nequal values form one block after sorting lower and upper bounds recover that block That same reasoning appears in many other problems and real systems.\nCommon Mistakes Sorting the array but still scanning the whole result after already knowing the boundaries Forgetting that upper_bound is exclusive Claiming this is strictly optimal without mentioning the linear counting alternative S - Summary After sorting, every copy of target becomes one contiguous segment. Lower and upper bounds recover the exact index interval of that segment. Returning range(l, r) is enough once the two boundaries are known. 
For this specific problem, direct counting is a valid faster alternative, but sort + bounds is the better teaching pattern for a binary-search series. Further Reading LeetCode 34: Find First and Last Position of Element in Sorted Array LeetCode 35: Search Insert Position Any standard documentation for bisect_left, bisect_right, lower_bound, and upper_bound Multi-language Implementations Python from bisect import bisect_left, bisect_right from typing import List def target_indices(nums: List[int], target: int) -\u0026gt; List[int]: nums = sorted(nums) l = bisect_left(nums, target) r = bisect_right(nums, target) return list(range(l, r)) if __name__ == \u0026#34;__main__\u0026#34;: print(target_indices([1, 2, 5, 2, 3], 2)) print(target_indices([1, 2, 5, 2, 3], 3)) print(target_indices([1, 2, 5, 2, 3], 5)) C #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; int cmp(const void *a, const void *b) { return (*(int *)a) - (*(int *)b); } int lowerBound(int *nums, int n, int target) { int l = 0, r = n; while (l \u0026lt; r) { int mid = l + (r - l) / 2; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } int upperBound(int *nums, int n, int target) { int l = 0, r = n; while (l \u0026lt; r) { int mid = l + (r - l) / 2; if (nums[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l; } int main(void) { int nums[] = {1, 2, 5, 2, 3}; int n = sizeof(nums) / sizeof(nums[0]); qsort(nums, n, sizeof(int), cmp); int l = lowerBound(nums, n, 2); int r = upperBound(nums, n, 2); for (int i = l; i \u0026lt; r; i++) { printf(\u0026#34;%d \u0026#34;, i); } printf(\u0026#34;\\n\u0026#34;); return 0; } C++ #include \u0026lt;algorithm\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; using namespace std; vector\u0026lt;int\u0026gt; targetIndices(vector\u0026lt;int\u0026gt; nums, int target) { sort(nums.begin(), nums.end()); auto l = lower_bound(nums.begin(), nums.end(), target); auto r = upper_bound(nums.begin(), 
nums.end(), target); vector\u0026lt;int\u0026gt; ans; for (auto it = l; it != r; ++it) { ans.push_back((int)(it - nums.begin())); } return ans; } int main() { auto ans = targetIndices({1, 2, 5, 2, 3}, 2); for (int x : ans) cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } Go package main import ( \u0026#34;fmt\u0026#34; \u0026#34;sort\u0026#34; ) func targetIndices(nums []int, target int) []int { sort.Ints(nums) l := sort.Search(len(nums), func(i int) bool { return nums[i] \u0026gt;= target }) r := sort.Search(len(nums), func(i int) bool { return nums[i] \u0026gt; target }) ans := make([]int, 0, r-l) for i := l; i \u0026lt; r; i++ { ans = append(ans, i) } return ans } func main() { fmt.Println(targetIndices([]int{1, 2, 5, 2, 3}, 2)) fmt.Println(targetIndices([]int{1, 2, 5, 2, 3}, 3)) } Rust fn lower_bound(nums: \u0026amp;[i32], target: i32) -\u0026gt; usize { let (mut l, mut r) = (0usize, nums.len()); while l \u0026lt; r { let mid = l + (r - l) / 2; if nums[mid] \u0026gt;= target { r = mid; } else { l = mid + 1; } } l } fn upper_bound(nums: \u0026amp;[i32], target: i32) -\u0026gt; usize { let (mut l, mut r) = (0usize, nums.len()); while l \u0026lt; r { let mid = l + (r - l) / 2; if nums[mid] \u0026gt; target { r = mid; } else { l = mid + 1; } } l } fn target_indices(mut nums: Vec\u0026lt;i32\u0026gt;, target: i32) -\u0026gt; Vec\u0026lt;usize\u0026gt; { nums.sort(); let l = lower_bound(\u0026amp;nums, target); let r = upper_bound(\u0026amp;nums, target); (l..r).collect() } fn main() { println!(\u0026#34;{:?}\u0026#34;, target_indices(vec![1, 2, 5, 2, 3], 2)); println!(\u0026#34;{:?}\u0026#34;, target_indices(vec![1, 2, 5, 2, 3], 3)); } JavaScript function lowerBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } function upperBound(nums, target) { let l 
= 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l; } function targetIndices(nums, target) { nums = nums.slice().sort((a, b) =\u0026gt; a - b); const l = lowerBound(nums, target); const r = upperBound(nums, target); const ans = []; for (let i = l; i \u0026lt; r; i++) ans.push(i); return ans; } console.log(targetIndices([1, 2, 5, 2, 3], 2)); console.log(targetIndices([1, 2, 5, 2, 3], 3)); console.log(targetIndices([1, 2, 5, 2, 3], 5)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/binary-search/2089-find-target-indices-after-sorting-array/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThis problem is a useful bridge between sorting and binary search. After sorting the array, all copies of \u003ccode\u003etarget\u003c/code\u003e become one contiguous block, and the answer is simply every index inside that block.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003esorting\u003c/code\u003e, \u003ccode\u003ebinary search\u003c/code\u003e, \u003ccode\u003erange location\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Find Target Indices After Sorting Array, LeetCode 2089, lower bound, upper bound\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Sort the array, use lower and upper bounds to find the target block, and return every matching index, with tradeoffs, engineering scenarios, and runnable implementations in six languages.\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLearners connecting sorting with lower/upper bound 
search\u003c/li\u003e\n\u003cli\u003eEngineers who need all positions of one value after offline sorting\u003c/li\u003e\n\u003cli\u003eInterview candidates reviewing how contiguous blocks form in sorted data\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThe input array is not sorted, so we cannot apply binary search immediately.\u003cbr\u003e\nBut once we sort it, every copy of the same value becomes one continuous segment.\u003c/p\u003e","title":"LeetCode 2089: Find Target Indices After Sorting Array ACERS Guide"},{"content":" Subtitle / Summary\nThis problem is a compact exercise in boundary counting. Because the array is already sorted, you do not count negatives and positives one by one; you find where zero starts and where zero ends, then compute both counts from those boundaries.\nReading time: 10-12 min Tags: binary search, counting, sorted array, boundaries SEO keywords: Maximum Count of Positive Integer and Negative Integer, LeetCode 2529, boundary counting Meta description: Use lower-bound and upper-bound binary search around zero to count negatives and positives in a sorted array, with correctness reasoning, engineering scenarios, and runnable implementations in six languages. Target Readers Learners practicing boundary search beyond exact-match lookup Engineers who count segments in sorted data Interview candidates learning how lower and upper bounds produce counts Background / Motivation The input is already sorted. That changes the problem completely.\nInstead of scanning the whole array and incrementing counters, we can ask:\nwhere do negative numbers stop? where do positive numbers start? 
Zeros act as the separator in the middle.\nSo this is really a boundary problem around the value 0.\nCore Concepts Negative count: number of values \u0026lt; 0 Positive count: number of values \u0026gt; 0 Lower bound of 0: first index where nums[i] \u0026gt;= 0 Upper bound of 0: first index where nums[i] \u0026gt; 0 From those:\nneg = lower_bound(0) pos = n - upper_bound(0) A - Algorithm Problem Restatement Given a sorted integer array nums that may contain negative numbers, zeros, and positive numbers:\nlet countNeg be the number of elements \u0026lt; 0 let countPos be the number of elements \u0026gt; 0 Return max(countNeg, countPos).\nInput / Output Name Type Meaning nums int[] sorted integer array return int larger of negative count and positive count Example 1 nums = [-3, -2, -1, 0, 0, 1, 2] output = 3 Example 2 nums = [-2, -1, -1, 1, 2, 3] output = 3 Example 3 nums = [0, 0, 0] output = 0 Thought Process: From Counting to Boundaries The direct idea is:\nscan the array count negatives count positives That is O(n), and for this problem it is actually acceptable.\nBut because the array is sorted, we can do better conceptually:\nnegatives are on the left zeros are in the middle positives are on the right So the counts are determined by two positions:\nthe first index \u0026gt;= 0 the first index \u0026gt; 0 C - Concepts Method Category Binary search Boundary counting Sorted partition lookup Why Two Searches Are Enough Let:\na = lower_bound(nums, 0) =\u0026gt; first non-negative index b = upper_bound(nums, 0) =\u0026gt; first positive index Then:\nall indices [0, a) are negative all indices [b, n) are positive So:\ncountNeg = a countPos = n - b Stable Algorithm compute neg = lower_bound(nums, 0) compute pos = len(nums) - upper_bound(nums, 0) return max(neg, pos) E - Engineering Scenario 1: Signed Score Analysis (Python) Background: a sorted score array contains losses, neutral events, and gains.\nWhy it fits: you only need the boundaries around zero, not a full scan 
for every query.\ndef lower_bound(nums, target): l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt;= target: r = mid else: l = mid + 1 return l def upper_bound(nums, target): l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt; target: r = mid else: l = mid + 1 return l nums = [-3, -2, -1, 0, 0, 1, 2] print(max(lower_bound(nums, 0), len(nums) - upper_bound(nums, 0))) Scenario 2: Sorted Risk Buckets (Go) Background: a risk engine stores sorted signed deviations and needs the dominant side.\nWhy it fits: zero is the natural split point.\npackage main import \u0026#34;fmt\u0026#34; func lowerBound(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt;= target { r = mid } else { l = mid + 1 } } return l } func upperBound(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt; target { r = mid } else { l = mid + 1 } } return l } func main() { nums := []int{-2, -1, -1, 1, 2, 3} neg := lowerBound(nums, 0) pos := len(nums) - upperBound(nums, 0) fmt.Println(max(neg, pos)) } func max(a, b int) int { if a \u0026gt; b { return a } return b } Scenario 3: Frontend Sorted Trend Display (JavaScript) Background: a UI shows sorted changes and wants to summarize whether negative or positive values dominate.\nWhy it fits: a boundary lookup is cheaper than repeated category checks when the list is reused.\nfunction lowerBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } function upperBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l; } const nums = [-3, -2, -1, 0, 0, 1, 2]; console.log(Math.max(lowerBound(nums, 0), nums.length - 
upperBound(nums, 0))); R - Reflection Complexity Time: O(log n) Space: O(1) Alternatives Linear scan: valid and simple, but it ignores the structural value of the sorted input Two pointers from both ends: unnecessary and harder to reason about than boundary search Common Mistakes Counting zeros as positive or negative Using only one boundary and trying to infer both counts from it Using \u0026gt;= 0 when you need strictly positive count Why This Method Is the Most Reusable This problem is really about extracting counts from sorted partitions.\nThat exact pattern appears in metric analysis, score bands, and threshold reporting, so lower and upper bounds are the right abstractions.\nS - Summary The sorted array splits naturally into negatives, zeros, and positives. lower_bound(0) gives the negative count. len(nums) - upper_bound(0) gives the positive count. Boundary search turns counting into a clean O(log n) partition lookup. Further Reading LeetCode 35: Search Insert Position LeetCode 34: Find First and Last Position of Element in Sorted Array Any standard lower_bound / upper_bound documentation Multi-language Implementations Python from typing import List def lower_bound(nums: List[int], target: int) -\u0026gt; int: l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt;= target: r = mid else: l = mid + 1 return l def upper_bound(nums: List[int], target: int) -\u0026gt; int: l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt; target: r = mid else: l = mid + 1 return l def maximum_count(nums: List[int]) -\u0026gt; int: neg = lower_bound(nums, 0) pos = len(nums) - upper_bound(nums, 0) return max(neg, pos) if __name__ == \u0026#34;__main__\u0026#34;: print(maximum_count([-3, -2, -1, 0, 0, 1, 2])) print(maximum_count([-2, -1, -1, 1, 2, 3])) print(maximum_count([0, 0, 0])) C #include \u0026lt;stdio.h\u0026gt; int lowerBound(int *nums, int n, int target) { int l = 0, r = n; while (l \u0026lt; r) { int mid = l + 
(r - l) / 2; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } int upperBound(int *nums, int n, int target) { int l = 0, r = n; while (l \u0026lt; r) { int mid = l + (r - l) / 2; if (nums[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l; } int maximumCount(int *nums, int n) { int neg = lowerBound(nums, n, 0); int pos = n - upperBound(nums, n, 0); return neg \u0026gt; pos ? neg : pos; } int main(void) { int a[] = {-3, -2, -1, 0, 0, 1, 2}; int b[] = {-2, -1, -1, 1, 2, 3}; int c[] = {0, 0, 0}; printf(\u0026#34;%d\\n\u0026#34;, maximumCount(a, 7)); printf(\u0026#34;%d\\n\u0026#34;, maximumCount(b, 6)); printf(\u0026#34;%d\\n\u0026#34;, maximumCount(c, 3)); return 0; } C++ #include \u0026lt;algorithm\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; using namespace std; int maximumCount(const vector\u0026lt;int\u0026gt;\u0026amp; nums) { int neg = lower_bound(nums.begin(), nums.end(), 0) - nums.begin(); int pos = nums.end() - upper_bound(nums.begin(), nums.end(), 0); return max(neg, pos); } int main() { cout \u0026lt;\u0026lt; maximumCount({-3, -2, -1, 0, 0, 1, 2}) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; cout \u0026lt;\u0026lt; maximumCount({-2, -1, -1, 1, 2, 3}) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; cout \u0026lt;\u0026lt; maximumCount({0, 0, 0}) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } Go package main import \u0026#34;fmt\u0026#34; func lowerBound(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt;= target { r = mid } else { l = mid + 1 } } return l } func upperBound(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt; target { r = mid } else { l = mid + 1 } } return l } func maximumCount(nums []int) int { neg := lowerBound(nums, 0) pos := len(nums) - upperBound(nums, 0) if neg \u0026gt; pos { return neg } return pos } func main() { 
fmt.Println(maximumCount([]int{-3, -2, -1, 0, 0, 1, 2})) fmt.Println(maximumCount([]int{-2, -1, -1, 1, 2, 3})) fmt.Println(maximumCount([]int{0, 0, 0})) } Rust fn lower_bound(nums: \u0026amp;[i32], target: i32) -\u0026gt; usize { let (mut l, mut r) = (0usize, nums.len()); while l \u0026lt; r { let mid = l + (r - l) / 2; if nums[mid] \u0026gt;= target { r = mid; } else { l = mid + 1; } } l } fn upper_bound(nums: \u0026amp;[i32], target: i32) -\u0026gt; usize { let (mut l, mut r) = (0usize, nums.len()); while l \u0026lt; r { let mid = l + (r - l) / 2; if nums[mid] \u0026gt; target { r = mid; } else { l = mid + 1; } } l } fn maximum_count(nums: \u0026amp;[i32]) -\u0026gt; usize { let neg = lower_bound(nums, 0); let pos = nums.len() - upper_bound(nums, 0); neg.max(pos) } fn main() { println!(\u0026#34;{}\u0026#34;, maximum_count(\u0026amp;[-3, -2, -1, 0, 0, 1, 2])); println!(\u0026#34;{}\u0026#34;, maximum_count(\u0026amp;[-2, -1, -1, 1, 2, 3])); println!(\u0026#34;{}\u0026#34;, maximum_count(\u0026amp;[0, 0, 0])); } JavaScript function lowerBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } function upperBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l; } function maximumCount(nums) { const neg = lowerBound(nums, 0); const pos = nums.length - upperBound(nums, 0); return Math.max(neg, pos); } console.log(maximumCount([-3, -2, -1, 0, 0, 1, 2])); console.log(maximumCount([-2, -1, -1, 1, 2, 3])); console.log(maximumCount([0, 0, 0])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/binary-search/2529-maximum-count-of-positive-integer-and-negative-integer/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThis 
problem is a compact exercise in boundary counting. Because the array is already sorted, you do not count negatives and positives one by one; you find where zero starts and where zero ends, then compute both counts from those boundaries.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003ebinary search\u003c/code\u003e, \u003ccode\u003ecounting\u003c/code\u003e, \u003ccode\u003esorted array\u003c/code\u003e, \u003ccode\u003eboundaries\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Maximum Count of Positive Integer and Negative Integer, LeetCode 2529, boundary counting\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Use lower-bound and upper-bound binary search around zero to count negatives and positives in a sorted array, with correctness reasoning, engineering scenarios, and runnable implementations in six languages.\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLearners practicing boundary search beyond exact-match lookup\u003c/li\u003e\n\u003cli\u003eEngineers who count segments in sorted data\u003c/li\u003e\n\u003cli\u003eInterview candidates learning how lower and upper bounds produce counts\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThe input is already sorted. 
That changes the problem completely.\u003c/p\u003e","title":"LeetCode 2529: Maximum Count of Positive Integer and Negative Integer ACERS Guide"},{"content":" Subtitle / Summary\nThis problem is the standard upgrade from “find one target” to “find the whole target block.” The clean solution is not a special-case binary search, but two boundary searches: lower bound for the start and upper bound for the end.\nReading time: 12-14 min Tags: binary search, lower bound, upper bound, range query SEO keywords: Search Range, lower bound, upper bound, LeetCode 34 Meta description: Use lower-bound and upper-bound binary search to find the first and last positions of a target in a sorted array, with pitfalls, engineering scenarios, and runnable implementations in six languages. Target Readers Learners who already know basic binary search but struggle with boundary problems Engineers who query sorted logs, timestamps, or grouped IDs Interview candidates who want one reusable range-search template Background / Motivation Finding a single target in a sorted array is the easy version.\nReal systems often need the full range:\nall log entries at the same timestamp all records with the same sorted key all metrics points equal to one threshold So the real question becomes:\nwhere does the target block start? where does the target block end? 
That is why this problem is best understood as a combination of:\nlower_bound(target) upper_bound(target) Core Concepts Lower bound: first index i such that nums[i] \u0026gt;= target Upper bound: first index i such that nums[i] \u0026gt; target Target range: if the target exists, the answer is start = lower_bound(target) end = upper_bound(target) - 1 A - Algorithm Problem Restatement Given a non-decreasing integer array nums and an integer target, return:\n[start, end] if target appears in the array [-1, -1] otherwise The required time complexity is O(log n).\nInput / Output Name Type Meaning nums int[] sorted array in non-decreasing order target int value to locate return int[] [start, end] or [-1, -1] Example 1 nums = [5, 7, 7, 8, 8, 10] target = 8 output = [3, 4] Example 2 nums = [5, 7, 7, 8, 8, 10] target = 6 output = [-1, -1] Example 3 nums = [] target = 0 output = [-1, -1] Thought Process: From Scan to Two Boundaries The naive idea is:\nscan until you see target keep moving until the target block ends That works, but costs O(n).\nThe array is sorted, so the target values form one continuous block if they exist.\nThat means the answer can be described by two monotonic boundaries:\nfirst position with value \u0026gt;= target first position with value \u0026gt; target So instead of trying to invent one tricky binary search, run two simple boundary searches.\nC - Concepts Method Category Binary search Boundary search Range query on sorted data Correctness Logic Let:\nl = lower_bound(target) r = upper_bound(target) Then:\nif l == len(nums) or nums[l] != target, the target does not exist otherwise the target occupies indices [l, r - 1] Stable Algorithm compute l = lower_bound(nums, target) if l is out of range or nums[l] != target, return [-1, -1] compute r = upper_bound(nums, target) return [l, r - 1] Why This Is Better Than “Find Any Equal First” If you stop when nums[mid] == target, you still do not know:\nwhether there is another target on the left whether 
there is another target on the right Boundary search solves the actual problem directly.\nE - Engineering Scenario 1: Querying Timestamp Blocks (Python) Background: logs are sorted by timestamp or event ID.\nWhy it fits: all matching records form one contiguous interval.\ndef lower_bound(nums, target): l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt;= target: r = mid else: l = mid + 1 return l def upper_bound(nums, target): l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt; target: r = mid else: l = mid + 1 return l times = [10, 10, 10, 13, 13, 20] left = lower_bound(times, 10) right = upper_bound(times, 10) - 1 print(left, right) Scenario 2: Sorted Order Buckets (Go) Background: a service stores sorted group IDs and needs the full range of one ID.\nWhy it fits: equal IDs appear in one block after sorting.\npackage main import \u0026#34;fmt\u0026#34; func lowerBound(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt;= target { r = mid } else { l = mid + 1 } } return l } func upperBound(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt; target { r = mid } else { l = mid + 1 } } return l } func main() { nums := []int{5, 7, 7, 8, 8, 10} l := lowerBound(nums, 8) r := upperBound(nums, 8) - 1 fmt.Println(l, r) } Scenario 3: Highlighting Duplicate Segments in a UI (JavaScript) Background: a frontend receives a sorted list and wants to highlight all matching values.\nWhy it fits: the UI needs a start index and an end index, not one arbitrary match.\nfunction lowerBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } function upperBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if 
(nums[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l; } const nums = [5, 7, 7, 8, 8, 10]; console.log([lowerBound(nums, 8), upperBound(nums, 8) - 1]); // [3, 4] R - Reflection Complexity Time: O(log n) because we run two binary searches Space: O(1) Alternatives Linear scan: easy, but O(n) Find one target and expand outward: still degrades to O(n) when many duplicates exist Common Mistakes Using \u0026gt;= in both helpers, which makes lower and upper bounds identical Forgetting to verify nums[l] == target before returning a range Returning [l, r] instead of [l, r - 1] Why This Method Is the Most Practical The target block is defined by two boundaries, so two boundary searches are the most direct, readable, and reusable solution.\nThis is exactly the form engineers use in sorted logs, metrics series, and key-index tables.\nS - Summary Search Range is a two-boundary problem, not a “find one match” problem. lower_bound gives the start and upper_bound - 1 gives the end. Verifying nums[l] == target is what separates “found” from “not found.” This template generalizes to many sorted-data range queries in real systems. 
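The two-boundary template above maps directly onto Python's standard library. As a sanity check (a minimal sketch, not part of the solutions above; the function name search_range_bisect is ours), bisect_left plays the role of lower_bound and bisect_right plays the role of upper_bound:

```python
from bisect import bisect_left, bisect_right

def search_range_bisect(nums, target):
    # bisect_left is lower_bound: first index i with nums[i] at least target.
    left = bisect_left(nums, target)
    if left == len(nums) or nums[left] != target:
        return [-1, -1]
    # bisect_right is upper_bound: first index strictly past the target block,
    # so the last occurrence sits one position to its left.
    return [left, bisect_right(nums, target) - 1]

print(search_range_bisect([5, 7, 7, 8, 8, 10], 8))  # [3, 4]
print(search_range_bisect([5, 7, 7, 8, 8, 10], 6))  # [-1, -1]
```

In production Python code, preferring these stdlib helpers over hand-rolled loops removes the fencepost risk entirely while keeping the same boundary model.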
Further Reading LeetCode 35: Search Insert Position LeetCode 744: Find Smallest Letter Greater Than Target Boundary-search utilities such as bisect_left, bisect_right, lower_bound, and upper_bound Multi-language Implementations Python from typing import List def lower_bound(nums: List[int], target: int) -\u0026gt; int: l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt;= target: r = mid else: l = mid + 1 return l def upper_bound(nums: List[int], target: int) -\u0026gt; int: l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt; target: r = mid else: l = mid + 1 return l def search_range(nums: List[int], target: int) -\u0026gt; List[int]: left = lower_bound(nums, target) if left == len(nums) or nums[left] != target: return [-1, -1] right = upper_bound(nums, target) - 1 return [left, right] if __name__ == \u0026#34;__main__\u0026#34;: print(search_range([5, 7, 7, 8, 8, 10], 8)) print(search_range([5, 7, 7, 8, 8, 10], 6)) print(search_range([], 0)) C #include \u0026lt;stdio.h\u0026gt; int lowerBound(int *nums, int n, int target) { int l = 0, r = n; while (l \u0026lt; r) { int mid = l + (r - l) / 2; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } int upperBound(int *nums, int n, int target) { int l = 0, r = n; while (l \u0026lt; r) { int mid = l + (r - l) / 2; if (nums[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l; } void searchRange(int *nums, int n, int target, int ans[2]) { int left = lowerBound(nums, n, target); if (left == n || nums[left] != target) { ans[0] = -1; ans[1] = -1; return; } ans[0] = left; ans[1] = upperBound(nums, n, target) - 1; } int main(void) { int nums[] = {5, 7, 7, 8, 8, 10}; int ans[2]; searchRange(nums, 6, 8, ans); printf(\u0026#34;[%d, %d]\\n\u0026#34;, ans[0], ans[1]); searchRange(nums, 6, 6, ans); printf(\u0026#34;[%d, %d]\\n\u0026#34;, ans[0], ans[1]); return 0; } C++ #include \u0026lt;algorithm\u0026gt; #include \u0026lt;iostream\u0026gt; 
#include \u0026lt;vector\u0026gt; using namespace std; vector\u0026lt;int\u0026gt; searchRange(const vector\u0026lt;int\u0026gt;\u0026amp; nums, int target) { auto left = lower_bound(nums.begin(), nums.end(), target); if (left == nums.end() || *left != target) { return {-1, -1}; } auto right = upper_bound(nums.begin(), nums.end(), target); return {(int)(left - nums.begin()), (int)(right - nums.begin() - 1)}; } int main() { vector\u0026lt;int\u0026gt; nums{5, 7, 7, 8, 8, 10}; auto a = searchRange(nums, 8); auto b = searchRange(nums, 6); cout \u0026lt;\u0026lt; \u0026#34;[\u0026#34; \u0026lt;\u0026lt; a[0] \u0026lt;\u0026lt; \u0026#34;, \u0026#34; \u0026lt;\u0026lt; a[1] \u0026lt;\u0026lt; \u0026#34;]\\n\u0026#34;; cout \u0026lt;\u0026lt; \u0026#34;[\u0026#34; \u0026lt;\u0026lt; b[0] \u0026lt;\u0026lt; \u0026#34;, \u0026#34; \u0026lt;\u0026lt; b[1] \u0026lt;\u0026lt; \u0026#34;]\\n\u0026#34;; return 0; } Go package main import \u0026#34;fmt\u0026#34; func lowerBound(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt;= target { r = mid } else { l = mid + 1 } } return l } func upperBound(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt; target { r = mid } else { l = mid + 1 } } return l } func searchRange(nums []int, target int) []int { left := lowerBound(nums, target) if left == len(nums) || nums[left] != target { return []int{-1, -1} } return []int{left, upperBound(nums, target) - 1} } func main() { fmt.Println(searchRange([]int{5, 7, 7, 8, 8, 10}, 8)) fmt.Println(searchRange([]int{5, 7, 7, 8, 8, 10}, 6)) } Rust fn lower_bound(nums: \u0026amp;[i32], target: i32) -\u0026gt; usize { let (mut l, mut r) = (0usize, nums.len()); while l \u0026lt; r { let mid = l + (r - l) / 2; if nums[mid] \u0026gt;= target { r = mid; } else { l = mid + 1; } } l } fn upper_bound(nums: \u0026amp;[i32], target: i32) -\u0026gt; usize { let (mut l, mut r) = (0usize, 
nums.len()); while l \u0026lt; r { let mid = l + (r - l) / 2; if nums[mid] \u0026gt; target { r = mid; } else { l = mid + 1; } } l } fn search_range(nums: \u0026amp;[i32], target: i32) -\u0026gt; [i32; 2] { let left = lower_bound(nums, target); if left == nums.len() || nums[left] != target { return [-1, -1]; } [left as i32, (upper_bound(nums, target) - 1) as i32] } fn main() { let nums = vec![5, 7, 7, 8, 8, 10]; println!(\u0026#34;{:?}\u0026#34;, search_range(\u0026amp;nums, 8)); println!(\u0026#34;{:?}\u0026#34;, search_range(\u0026amp;nums, 6)); } JavaScript function lowerBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } function upperBound(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l; } function searchRange(nums, target) { const left = lowerBound(nums, target); if (left === nums.length || nums[left] !== target) { return [-1, -1]; } return [left, upperBound(nums, target) - 1]; } console.log(searchRange([5, 7, 7, 8, 8, 10], 8)); console.log(searchRange([5, 7, 7, 8, 8, 10], 6)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/binary-search/34-find-first-and-last-position-of-element-in-sorted-array/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThis problem is the standard upgrade from “find one target” to “find the whole target block.” The clean solution is not a special-case binary search, but two boundary searches: lower bound for the start and upper bound for the end.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-14 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003ebinary 
search\u003c/code\u003e, \u003ccode\u003elower bound\u003c/code\u003e, \u003ccode\u003eupper bound\u003c/code\u003e, \u003ccode\u003erange query\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Search Range, lower bound, upper bound, LeetCode 34\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Use lower-bound and upper-bound binary search to find the first and last positions of a target in a sorted array, with pitfalls, engineering scenarios, and runnable implementations in six languages.\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLearners who already know basic binary search but struggle with boundary problems\u003c/li\u003e\n\u003cli\u003eEngineers who query sorted logs, timestamps, or grouped IDs\u003c/li\u003e\n\u003cli\u003eInterview candidates who want one reusable range-search template\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eFinding a single target in a sorted array is the easy version.\u003cbr\u003e\nReal systems often need the full range:\u003c/p\u003e","title":"LeetCode 34: Find First and Last Position of Element in Sorted Array ACERS Guide"},{"content":" Subtitle / Summary\nSearch Insert Position is the cleanest lower-bound problem in LeetCode. If you can reliably find the first index where nums[i] \u0026gt;= target, you already have the core template for insert positions, range starts, and many boundary-search problems.\nReading time: 10-12 min Tags: binary search, lower bound, sorted array SEO keywords: Search Insert Position, lower bound, binary search, LeetCode 35 Meta description: Lower-bound binary search for Search Insert Position, with boundary reasoning, pitfalls, engineering scenarios, and runnable implementations in Python, C, C++, Go, Rust, and JavaScript. 
Target Readers Learners who know basic binary search but still hesitate on boundary handling Engineers who insert or locate values in sorted tables Interview candidates who want one reusable lower-bound template Background / Motivation This problem looks simple because the output is a single index.\nThe real lesson is deeper:\nif the target exists, return its index if it does not exist, return the insertion position Those two requirements can be unified into one question:\nWhat is the first index whose value is greater than or equal to target?\nThat question is exactly lower_bound.\nOnce that idea is stable, a large family of binary-search problems becomes easier:\ninsert position first occurrence range start counting \u0026lt; target or \u0026gt;= target Core Concepts Sorted array: binary search only works because order gives a monotonic decision rule Lower bound: the first index i such that nums[i] \u0026gt;= target Half-open interval: using [l, r) keeps the code concise and avoids fencepost bugs Monotonic predicate: nums[i] \u0026gt;= target is false ... false, true ... 
true A - Algorithm Problem Restatement Given a non-decreasing integer array nums and an integer target:\nreturn the index of target if it exists otherwise return the index where target should be inserted to keep the array sorted The required time complexity is O(log n).\nInput / Output Name Type Meaning nums int[] sorted array in non-decreasing order target int value to search or insert return int existing index or insertion index Example 1 nums = [1, 3, 5, 6] target = 5 output = 2 Example 2 nums = [1, 3, 5, 6] target = 2 output = 1 Example 3 nums = [1, 3, 5, 6] target = 7 output = 4 Example 4 nums = [1, 3, 5, 6] target = 0 output = 0 Thought Process: From Linear Scan to Lower Bound The brute-force idea is simple:\nscan from left to right stop at the first value \u0026gt;= target return that index if nothing qualifies, return n That works, but it costs O(n).\nBecause the array is already sorted, the condition\nnums[i] \u0026gt;= target forms a monotonic pattern:\nfalse false false true true true Binary search is exactly the tool for finding the first true.\nC - Concepts Method Category Binary search Boundary search Lower-bound template Why the Same Index Solves Both Cases If target exists, the first position with nums[i] \u0026gt;= target is the first position where nums[i] == target.\nIf target does not exist, the first position with nums[i] \u0026gt;= target is the place where target must be inserted.\nSo one return value handles both outcomes.\nStable Template Use a half-open interval [l, r):\ninitialize l = 0, r = len(nums) while l \u0026lt; r: let mid = l + (r - l) // 2 if nums[mid] \u0026gt;= target, keep the answer on the left: r = mid otherwise move right: l = mid + 1 return l Why [l, r) Is Convenient r can safely start at len(nums) the insertion-at-end case naturally returns len(nums) the loop invariant is easy: the answer always stays inside [l, r) E - Engineering Scenario 1: Threshold Tables in Backend Services (Python) Background: a service stores 
sorted thresholds such as latency buckets or score cutoffs.\nWhy it fits: you need the first threshold that is not smaller than the incoming value.\ndef search_insert(nums, target): l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt;= target: r = mid else: l = mid + 1 return l thresholds = [10, 30, 60, 100] for x in [5, 10, 25, 70, 101]: print(x, \u0026#34;-\u0026gt; slot\u0026#34;, search_insert(thresholds, x)) Scenario 2: Pricing or Risk Tiers (Go) Background: an order amount must be mapped into the correct sorted tier table.\nWhy it fits: insert position is exactly the tier index.\npackage main import \u0026#34;fmt\u0026#34; func searchInsert(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt;= target { r = mid } else { l = mid + 1 } } return l } func main() { tiers := []int{1000, 5000, 10000, 50000} for _, amount := range []int{500, 1000, 2000, 20000} { fmt.Println(amount, \u0026#34;-\u0026gt; tier\u0026#34;, searchInsert(tiers, amount)) } } Scenario 3: Frontend Timeline or Version Selection (JavaScript) Background: a UI highlights the first version not smaller than the current version.\nWhy it fits: the highlighted node is a lower-bound lookup.\nfunction searchInsert(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt;= target) r = mid; else l = mid + 1; } return l; } console.log(searchInsert([1, 3, 5, 6], 5)); // 2 console.log(searchInsert([1, 3, 5, 6], 4)); // 2 R - Reflection Complexity Time: O(log n) Space: O(1) Alternatives Linear scan: simpler, but O(n) and misses the point of the sorted input Library lower_bound / bisect_left: excellent in production, but you still need the boundary model to use them correctly Common Mistakes Using \u0026gt; instead of \u0026gt;=, which turns the answer into an upper bound Returning immediately when nums[mid] == target, which loses the 
insertion-position interpretation Mixing interval styles, for example using [l, r) updates with a [l, r] initialization Why This Is the Best Practical Method The array is sorted, the predicate is monotonic, and the return value is a boundary index.\nThat is exactly the shape binary search is designed for, so this is both the optimal asymptotic solution and the cleanest engineering template.\nS - Summary Search Insert Position is a pure lower-bound problem. The answer is the first index where nums[i] \u0026gt;= target. A half-open interval [l, r) makes the boundary logic stable. This template directly extends to range queries, counts, and insertion-point lookups in real systems. Further Reading LeetCode 34: Find First and Last Position of Element in Sorted Array LeetCode 744: Find Smallest Letter Greater Than Target Python bisect_left, C++ lower_bound, and Go sort.Search Multi-language Implementations Python from typing import List def search_insert(nums: List[int], target: int) -\u0026gt; int: l, r = 0, len(nums) while l \u0026lt; r: mid = (l + r) // 2 if nums[mid] \u0026gt;= target: r = mid else: l = mid + 1 return l if __name__ == \u0026#34;__main__\u0026#34;: print(search_insert([1, 3, 5, 6], 5)) # 2 print(search_insert([1, 3, 5, 6], 2)) # 1 print(search_insert([1, 3, 5, 6], 7)) # 4 print(search_insert([1, 3, 5, 6], 0)) # 0 C #include \u0026lt;stdio.h\u0026gt; int searchInsert(int *nums, int numsSize, int target) { int l = 0, r = numsSize; while (l \u0026lt; r) { int mid = l + (r - l) / 2; if (nums[mid] \u0026gt;= target) { r = mid; } else { l = mid + 1; } } return l; } int main(void) { int nums[] = {1, 3, 5, 6}; int n = sizeof(nums) / sizeof(nums[0]); printf(\u0026#34;%d\\n\u0026#34;, searchInsert(nums, n, 5)); printf(\u0026#34;%d\\n\u0026#34;, searchInsert(nums, n, 2)); printf(\u0026#34;%d\\n\u0026#34;, searchInsert(nums, n, 7)); printf(\u0026#34;%d\\n\u0026#34;, searchInsert(nums, n, 0)); return 0; } C++ #include \u0026lt;iostream\u0026gt; #include 
\u0026lt;vector\u0026gt; using namespace std; int searchInsert(const vector\u0026lt;int\u0026gt;\u0026amp; nums, int target) { int l = 0, r = (int)nums.size(); while (l \u0026lt; r) { int mid = l + (r - l) / 2; if (nums[mid] \u0026gt;= target) { r = mid; } else { l = mid + 1; } } return l; } int main() { vector\u0026lt;int\u0026gt; nums{1, 3, 5, 6}; cout \u0026lt;\u0026lt; searchInsert(nums, 5) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; cout \u0026lt;\u0026lt; searchInsert(nums, 2) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; cout \u0026lt;\u0026lt; searchInsert(nums, 7) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; cout \u0026lt;\u0026lt; searchInsert(nums, 0) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } Go package main import \u0026#34;fmt\u0026#34; func searchInsert(nums []int, target int) int { l, r := 0, len(nums) for l \u0026lt; r { mid := l + (r-l)/2 if nums[mid] \u0026gt;= target { r = mid } else { l = mid + 1 } } return l } func main() { nums := []int{1, 3, 5, 6} fmt.Println(searchInsert(nums, 5)) fmt.Println(searchInsert(nums, 2)) fmt.Println(searchInsert(nums, 7)) fmt.Println(searchInsert(nums, 0)) } Rust fn search_insert(nums: \u0026amp;[i32], target: i32) -\u0026gt; usize { let (mut l, mut r) = (0usize, nums.len()); while l \u0026lt; r { let mid = l + (r - l) / 2; if nums[mid] \u0026gt;= target { r = mid; } else { l = mid + 1; } } l } fn main() { let nums = vec![1, 3, 5, 6]; println!(\u0026#34;{}\u0026#34;, search_insert(\u0026amp;nums, 5)); println!(\u0026#34;{}\u0026#34;, search_insert(\u0026amp;nums, 2)); println!(\u0026#34;{}\u0026#34;, search_insert(\u0026amp;nums, 7)); println!(\u0026#34;{}\u0026#34;, search_insert(\u0026amp;nums, 0)); } JavaScript function searchInsert(nums, target) { let l = 0, r = nums.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (nums[mid] \u0026gt;= target) { r = mid; } else { l = mid + 1; } } return l; } console.log(searchInsert([1, 3, 5, 6], 5)); console.log(searchInsert([1, 3, 5, 
6], 2)); console.log(searchInsert([1, 3, 5, 6], 7)); console.log(searchInsert([1, 3, 5, 6], 0)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/binary-search/35-search-insert-position/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nSearch Insert Position is the cleanest lower-bound problem in LeetCode. If you can reliably find the first index where \u003ccode\u003enums[i] \u0026gt;= target\u003c/code\u003e, you already have the core template for insert positions, range starts, and many boundary-search problems.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003ebinary search\u003c/code\u003e, \u003ccode\u003elower bound\u003c/code\u003e, \u003ccode\u003esorted array\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Search Insert Position, lower bound, binary search, LeetCode 35\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Lower-bound binary search for Search Insert Position, with boundary reasoning, pitfalls, engineering scenarios, and runnable implementations in Python, C, C++, Go, Rust, and JavaScript.\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLearners who know basic binary search but still hesitate on boundary handling\u003c/li\u003e\n\u003cli\u003eEngineers who insert or locate values in sorted tables\u003c/li\u003e\n\u003cli\u003eInterview candidates who want one reusable lower-bound template\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThis problem looks simple because the output is a single index.\u003cbr\u003e\nThe real lesson is 
deeper:\u003c/p\u003e","title":"LeetCode 35: Search Insert Position Lower-Bound Binary Search ACERS Guide"},{"content":" Subtitle / Summary\nThis problem is a textbook upper-bound search with one extra twist: wrap-around. Once you can find the first character \u0026gt; target, the rest is just handling the “no answer inside the array” case by returning the first element.\nReading time: 10-12 min Tags: binary search, upper bound, characters, wrap-around SEO keywords: Find Smallest Letter Greater Than Target, upper bound, LeetCode 744 Meta description: Use upper-bound binary search and wrap-around handling to solve LeetCode 744, with correctness reasoning, pitfalls, engineering scenarios, and runnable code in six languages. Target Readers Learners who already know lower bound and want to master upper bound Engineers who search the next greater value in a sorted cyclic list Interview candidates practicing boundary-style binary search Background / Motivation At first glance, this looks like a character problem.\nIt is actually a boundary problem:\nfind the first element strictly greater than target if it does not exist, wrap to the first element That exact shape appears in many systems:\nnext version after the current version next shard or route after the current key next allowed symbol in a cyclic ordered set So the real concept is not “letters.”\nIt is upper bound plus wrap-around.\nCore Concepts Upper bound: first index i such that letters[i] \u0026gt; target Wrap-around: if no such index exists, answer letters[0] Sorted array: gives the monotonic rule required by binary search A - Algorithm Problem Restatement You are given a sorted list of lowercase letters letters and a target character target.\nReturn the smallest character in letters that is strictly greater than target.\nThe array is considered circular, so:\nif every character is \u0026lt;= target return the first character in the array Input / Output Name Type Meaning letters char[] sorted lowercase letters 
target char current character return char smallest letter strictly greater than target Example 1 letters = [\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;] target = \u0026#39;a\u0026#39; output = \u0026#39;c\u0026#39; Example 2 letters = [\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;] target = \u0026#39;c\u0026#39; output = \u0026#39;f\u0026#39; Example 3 letters = [\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;] target = \u0026#39;j\u0026#39; output = \u0026#39;c\u0026#39; Thought Process: From Scan to Upper Bound The direct approach is:\nscan left to right return the first character \u0026gt; target if none exists, return the first character That is O(n).\nBecause letters is sorted, the predicate\nletters[i] \u0026gt; target changes from false to true exactly once.\nThat means binary search can find the first true in O(log n).\nC - Concepts Method Category Binary search Upper-bound boundary search Circular fallback logic Why Strictly Greater Matters This problem is not asking for:\nfirst \u0026gt;= target It asks for:\nfirst \u0026gt; target That one symbol difference decides whether you need lower bound or upper bound.\nStable Algorithm run upper-bound binary search to find the first index where letters[i] \u0026gt; target if the index is inside the array, return letters[index] otherwise return letters[0] Boundary Invariant Using [l, r):\nif letters[mid] \u0026gt; target, keep the answer in the left half otherwise move right At the end, l is the first valid index, or len(letters) if no valid index exists.\nE - Engineering Scenario 1: Next Version Selection (Python) Background: versions are kept in sorted order and you need the next greater version marker.\nWhy it fits: the answer is an upper bound in a cyclic list.\ndef next_greater(items, target): l, r = 0, len(items) while l \u0026lt; r: mid = (l + r) // 2 if items[mid] \u0026gt; target: r = mid else: l = mid + 1 return items[l] if l \u0026lt; 
len(items) else items[0] print(next_greater([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;a\u0026#34;)) print(next_greater([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;j\u0026#34;)) Scenario 2: Sorted Cyclic Routing Table (Go) Background: routes are stored in sorted order and requests move to the next available slot.\nWhy it fits: if the current route is at the end, the system wraps to the first slot.\npackage main import \u0026#34;fmt\u0026#34; func nextGreater(items []byte, target byte) byte { l, r := 0, len(items) for l \u0026lt; r { mid := l + (r-l)/2 if items[mid] \u0026gt; target { r = mid } else { l = mid + 1 } } if l \u0026lt; len(items) { return items[l] } return items[0] } func main() { fmt.Printf(\u0026#34;%c\\n\u0026#34;, nextGreater([]byte{\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;}, \u0026#39;c\u0026#39;)) fmt.Printf(\u0026#34;%c\\n\u0026#34;, nextGreater([]byte{\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;}, \u0026#39;j\u0026#39;)) } Scenario 3: Frontend Keyboard Hinting (JavaScript) Background: a UI suggests the next available sorted symbol after the current input.\nWhy it fits: strict-next plus wrap-around is the same model.\nfunction nextGreatestLetter(letters, target) { let l = 0, r = letters.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (letters[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l \u0026lt; letters.length ? 
letters[l] : letters[0]; } console.log(nextGreatestLetter([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;c\u0026#34;)); // f console.log(nextGreatestLetter([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;j\u0026#34;)); // c R - Reflection Complexity Time: O(log n) Space: O(1) Alternatives Linear scan: easy, but slower on large arrays Modulo tricks without binary search: not useful because the key challenge is still finding the upper bound Common Mistakes Using \u0026gt;= instead of \u0026gt;, which returns the wrong answer when target itself appears in the array Forgetting the wrap-around case Returning letters[l - 1] or another neighbor without proving it correct Why This Is the Best Practical Method The array is sorted, the condition is monotonic, and the fallback is simple.\nThat makes upper-bound binary search plus one wrap-around check the cleanest and most reusable solution.\nS - Summary This is a pure upper-bound problem with a cyclic fallback. The correct search condition is letters[i] \u0026gt; target, not \u0026gt;=. If no valid index exists, the answer is the first element. The same pattern appears in version routing, cyclic lookup, and next-greater selection problems. 
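The upper-bound-plus-wrap-around pattern summarized above can also be expressed with Python's standard-library bisect_right helper, which returns exactly the first index whose element is strictly greater than the target; a minimal sketch, not the article's primary template:

```python
import bisect

def next_greatest_letter(letters, target):
    # bisect_right is the upper bound: the first index i with
    # letters[i] > target, because elements equal to target sort left.
    i = bisect.bisect_right(letters, target)
    # Wrap around to the first element when no letter is greater.
    return letters[i] if i < len(letters) else letters[0]

print(next_greatest_letter(["c", "f", "j"], "a"))  # c
print(next_greatest_letter(["c", "f", "j"], "c"))  # f
print(next_greatest_letter(["c", "f", "j"], "j"))  # c
```

This is behaviorally identical to the hand-rolled [l, r) loop; it simply delegates the boundary search to the library.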
Further Reading LeetCode 35: Search Insert Position LeetCode 34: Find First and Last Position of Element in Sorted Array Standard-library upper-bound helpers such as bisect_right and upper_bound Multi-language Implementations Python from typing import List def next_greatest_letter(letters: List[str], target: str) -\u0026gt; str: l, r = 0, len(letters) while l \u0026lt; r: mid = (l + r) // 2 if letters[mid] \u0026gt; target: r = mid else: l = mid + 1 return letters[l] if l \u0026lt; len(letters) else letters[0] if __name__ == \u0026#34;__main__\u0026#34;: print(next_greatest_letter([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;a\u0026#34;)) print(next_greatest_letter([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;c\u0026#34;)) print(next_greatest_letter([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;j\u0026#34;)) C #include \u0026lt;stdio.h\u0026gt; char nextGreatestLetter(char *letters, int n, char target) { int l = 0, r = n; while (l \u0026lt; r) { int mid = l + (r - l) / 2; if (letters[mid] \u0026gt; target) { r = mid; } else { l = mid + 1; } } return l \u0026lt; n ? letters[l] : letters[0]; } int main(void) { char letters[] = {\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;}; printf(\u0026#34;%c\\n\u0026#34;, nextGreatestLetter(letters, 3, \u0026#39;a\u0026#39;)); printf(\u0026#34;%c\\n\u0026#34;, nextGreatestLetter(letters, 3, \u0026#39;c\u0026#39;)); printf(\u0026#34;%c\\n\u0026#34;, nextGreatestLetter(letters, 3, \u0026#39;j\u0026#39;)); return 0; } C++ #include \u0026lt;algorithm\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; using namespace std; char nextGreatestLetter(const vector\u0026lt;char\u0026gt;\u0026amp; letters, char target) { auto it = upper_bound(letters.begin(), letters.end(), target); return it == letters.end() ? 
letters[0] : *it; } int main() { vector\u0026lt;char\u0026gt; letters{\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;}; cout \u0026lt;\u0026lt; nextGreatestLetter(letters, \u0026#39;a\u0026#39;) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; cout \u0026lt;\u0026lt; nextGreatestLetter(letters, \u0026#39;c\u0026#39;) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; cout \u0026lt;\u0026lt; nextGreatestLetter(letters, \u0026#39;j\u0026#39;) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } Go package main import \u0026#34;fmt\u0026#34; func nextGreatestLetter(letters []byte, target byte) byte { l, r := 0, len(letters) for l \u0026lt; r { mid := l + (r-l)/2 if letters[mid] \u0026gt; target { r = mid } else { l = mid + 1 } } if l \u0026lt; len(letters) { return letters[l] } return letters[0] } func main() { fmt.Printf(\u0026#34;%c\\n\u0026#34;, nextGreatestLetter([]byte{\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;}, \u0026#39;a\u0026#39;)) fmt.Printf(\u0026#34;%c\\n\u0026#34;, nextGreatestLetter([]byte{\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;}, \u0026#39;c\u0026#39;)) fmt.Printf(\u0026#34;%c\\n\u0026#34;, nextGreatestLetter([]byte{\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;}, \u0026#39;j\u0026#39;)) } Rust fn next_greatest_letter(letters: \u0026amp;[char], target: char) -\u0026gt; char { let (mut l, mut r) = (0usize, letters.len()); while l \u0026lt; r { let mid = l + (r - l) / 2; if letters[mid] \u0026gt; target { r = mid; } else { l = mid + 1; } } if l \u0026lt; letters.len() { letters[l] } else { letters[0] } } fn main() { let letters = vec![\u0026#39;c\u0026#39;, \u0026#39;f\u0026#39;, \u0026#39;j\u0026#39;]; println!(\u0026#34;{}\u0026#34;, next_greatest_letter(\u0026amp;letters, \u0026#39;a\u0026#39;)); println!(\u0026#34;{}\u0026#34;, next_greatest_letter(\u0026amp;letters, \u0026#39;c\u0026#39;)); println!(\u0026#34;{}\u0026#34;, next_greatest_letter(\u0026amp;letters, 
\u0026#39;j\u0026#39;)); } JavaScript function nextGreatestLetter(letters, target) { let l = 0, r = letters.length; while (l \u0026lt; r) { const mid = (l + r) \u0026gt;\u0026gt; 1; if (letters[mid] \u0026gt; target) r = mid; else l = mid + 1; } return l \u0026lt; letters.length ? letters[l] : letters[0]; } console.log(nextGreatestLetter([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;a\u0026#34;)); console.log(nextGreatestLetter([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;c\u0026#34;)); console.log(nextGreatestLetter([\u0026#34;c\u0026#34;, \u0026#34;f\u0026#34;, \u0026#34;j\u0026#34;], \u0026#34;j\u0026#34;)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/binary-search/744-find-smallest-letter-greater-than-target/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThis problem is a textbook upper-bound search with one extra twist: wrap-around. 
Once you can find the first character \u003ccode\u003e\u0026gt; target\u003c/code\u003e, the rest is just handling the “no answer inside the array” case by returning the first element.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003ebinary search\u003c/code\u003e, \u003ccode\u003eupper bound\u003c/code\u003e, \u003ccode\u003echaracters\u003c/code\u003e, \u003ccode\u003ewrap-around\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Find Smallest Letter Greater Than Target, upper bound, LeetCode 744\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Use upper-bound binary search and wrap-around handling to solve LeetCode 744, with correctness reasoning, pitfalls, engineering scenarios, and runnable code in six languages.\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLearners who already know lower bound and want to master upper bound\u003c/li\u003e\n\u003cli\u003eEngineers who search the next greater value in a sorted cyclic list\u003c/li\u003e\n\u003cli\u003eInterview candidates practicing boundary-style binary search\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eAt first glance, this looks like a character problem.\u003cbr\u003e\nIt is actually a boundary problem:\u003c/p\u003e","title":"LeetCode 744: Find Smallest Letter Greater Than Target Upper-Bound ACERS Guide"},{"content":" Subtitle / Summary\nLevel order traversal is the entry point of the binary-tree BFS template. The real key is not merely \u0026ldquo;use a queue\u0026rdquo;, but \u0026ldquo;separate one level from the next correctly\u0026rdquo;. 
This ACERS guide explains the level-size pattern, the DFS depth-bucket alternative, and engineering situations where grouped-by-depth traversal is useful.\nReading time: 10-12 min Tags: Hot100, binary tree, BFS, DFS, queue, level order traversal SEO keywords: Hot100, Binary Tree Level Order Traversal, BFS, queue, level order traversal, LeetCode 102 Meta description: A systematic guide to LeetCode 102 from level-by-level BFS to DFS depth buckets, with engineering scenarios and runnable multi-language implementations. Target Readers Hot100 learners who want to make the BFS tree template stable Developers who can traverse a tree but still mix up current-level and next-level boundaries Engineers who need to group tree-shaped data by depth for display or execution Background / Motivation LeetCode 102 is one of the most standard tree-BFS starter problems.\nWhat it really trains is not just \u0026ldquo;visit all nodes\u0026rdquo;, but two more important skills:\nuse a queue to maintain the next batch of nodes to process separate the current level from the next level cleanly Many BFS bugs come exactly from this boundary issue:\nusing the changing queue.length directly while iterating the current level mixing newly pushed children into the current level\u0026rsquo;s answer forgetting the empty-tree check and touching null immediately If you stabilize the 102 template, later problems like:\nright side view average of levels zigzag level order traversal minimum depth or maximum depth via BFS become much easier.\nCore Concepts Level order traversal: visit nodes level by level from top to bottom and left to right BFS (breadth-first search): process the current layer first, then expand the next layer Level-size snapshot: record the queue length before processing a level; that number is exactly how many nodes belong to this level Depth bucket: the DFS alternative, where values are stored in res[depth] A - Algorithm (Problem and Algorithm) Problem Restatement Given the root node 
root of a binary tree, return the level order traversal of its node values.\nIn other words, return the values level by level from left to right.\nInput / Output Name Type Description root TreeNode root of the binary tree, may be null return List[List[int]] node values grouped by level Example 1 input: root = [3,9,20,null,null,15,7] output: [[3],[9,20],[15,7]] explanation: level 1 -\u0026gt; [3] level 2 -\u0026gt; [9,20] level 3 -\u0026gt; [15,7] Example 2 input: root = [1] output: [[1]] Example 3 input: root = [] output: [] Constraints The number of nodes is in the range [0, 2000] -1000 \u0026lt;= Node.val \u0026lt;= 1000 C - Concepts (Core Ideas) Thought Process: The key is not the queue itself, but the level boundary If the task were only \u0026ldquo;visit every node\u0026rdquo;, ordinary BFS would be enough.\nBut this problem asks for a grouped result like [[level1], [level2], ...], so you must know:\nwhich nodes popped from the queue belong to the current level which children should be saved for the next round The most stable pattern is:\nbefore processing a level, record level_size = len(queue) pop exactly level_size nodes place the values of those nodes into the same level array push their children into the queue for the next round Why we must record level_size first Because while you process the current level, you keep pushing children from the next level into the same queue.\nIf you use the changing queue length directly as the loop condition, the current level and the next level will get mixed together.\nMethod Category BFS with queue Level grouping DFS with depth buckets (alternative) Why DFS can also work If you use DFS, carry the current depth depth in the recursive call:\nif depth == len(res), create a new bucket for that level append the current value to res[depth] That also produces a grouped-by-level result, although BFS is the more direct first choice for this problem.\nPractice Guide / Steps Recommended Approach: Level-by-level BFS Return [] 
immediately if the root is null Initialize a queue with the root At the start of each round, record level_size Pop exactly level_size nodes and collect their values Push their children into the queue Append the finished level array to the answer Runnable Python example:\nfrom collections import deque class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def level_order(root): if root is None: return [] ans = [] q = deque([root]) while q: level_size = len(q) level = [] for _ in range(level_size): node = q.popleft() level.append(node.val) if node.left is not None: q.append(node.left) if node.right is not None: q.append(node.right) ans.append(level) return ans if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7))) print(level_order(root)) DFS Alternative If you want to practice the depth-bucket idea:\ncarry depth in recursion when depth == len(res), create a new level array append the current node value to res[depth] This is handy when the same traversal also needs other DFS-style statistics, but for LeetCode 102, BFS is more intuitive.\nE - Engineering (Real-world Scenarios) Scenario 1: Group an org chart by management level (Python) Background: reporting structures and org charts are naturally tree-shaped.\nWhy it fits: UI displays often need data grouped by CEO, VP, director, and manager layers.\nfrom collections import deque def group_by_level(root): if root is None: return [] q = deque([root]) ans = [] while q: level = [] for _ in range(len(q)): node = q.popleft() level.append(node[\u0026#34;name\u0026#34;]) for child in node.get(\u0026#34;children\u0026#34;, []): q.append(child) ans.append(level) return ans org = {\u0026#34;name\u0026#34;: \u0026#34;CEO\u0026#34;, \u0026#34;children\u0026#34;: [{\u0026#34;name\u0026#34;: \u0026#34;VP1\u0026#34;}, {\u0026#34;name\u0026#34;: \u0026#34;VP2\u0026#34;}]} print(group_by_level(org)) Scenario 2: 
Render a menu tree level by level (JavaScript) Background: admin menus and site navigation trees are often stored as hierarchical configs.\nWhy it fits: some interfaces progressively render or lazy-load one depth at a time to reduce initial complexity.\nfunction levelOrder(root) { if (!root) return []; const queue = [root]; const ans = []; while (queue.length) { const size = queue.length; const level = []; for (let i = 0; i \u0026lt; size; i += 1) { const node = queue.shift(); level.push(node.name); for (const child of node.children || []) queue.push(child); } ans.push(level); } return ans; } const menu = { name: \u0026#34;root\u0026#34;, children: [{ name: \u0026#34;docs\u0026#34;, children: [] }, { name: \u0026#34;blog\u0026#34;, children: [] }] }; console.log(levelOrder(menu)); Scenario 3: Execute tree-shaped tasks wave by wave (Go) Background: some workflow engines treat dependent tasks as a tree.\nWhy it fits: tasks at the same depth can be executed or inspected as one wave before moving to the next depth.\npackage main import \u0026#34;fmt\u0026#34; type Task struct { Name string Children []*Task } func waves(root *Task) [][]string { if root == nil { return [][]string{} } q := []*Task{root} ans := [][]string{} for len(q) \u0026gt; 0 { size := len(q) level := make([]string, 0, size) for i := 0; i \u0026lt; size; i++ { node := q[0] q = q[1:] level = append(level, node.Name) q = append(q, node.Children...) 
} ans = append(ans, level) } return ans } func main() { root := \u0026amp;Task{ Name: \u0026#34;build\u0026#34;, Children: []*Task{ {Name: \u0026#34;unit-test\u0026#34;}, {Name: \u0026#34;lint\u0026#34;}, }, } fmt.Println(waves(root)) } R - Reflection (Analysis and Deeper Understanding) Complexity Analysis Time complexity: O(n), because each node is pushed and popped once Space complexity: BFS: O(w), where w is the maximum width of the tree DFS depth buckets: O(h) recursion stack, plus the result array itself Alternative Approaches Method Time Extra Space Notes Level-by-level BFS O(n) O(w) Most natural and the recommended template DFS depth buckets O(n) O(h) Works well, but the level-order intuition is less direct Traverse first, regroup later O(n) Extra map or array Possible, but more indirect than grouping during traversal Common Mistakes and Pitfalls Not recording level_size first, so newly added children get mixed into the current level Forgetting to return an empty array when the root is null Defining the level array outside the outer loop and accidentally reusing the same array for all levels In JavaScript, iterating with a changing queue.length and breaking the level boundary Common Questions and Notes 1. Why must we record the queue length before each level starts? Because the queue changes during processing.\nOnly the original queue length at the start of the round tells you how many nodes belong to the current level.\n2. Does this problem have to be solved with BFS? No. DFS with depth tracking also works. But LeetCode 102 is the canonical BFS level-order template, so BFS is the best first answer.\n3. What should an empty tree return? 
Return [], not [[]].\nBest Practices and Suggestions For every \u0026ldquo;output by level\u0026rdquo; tree problem, think of the level_size template first Keep responsibilities clear: the queue stores nodes, while the level array stores values When practicing DFS, remember the trigger: depth == len(res) means create a new level Problems 102, 107, 199, and 637 make a good mini-series for BFS level-order variations S - Summary The heart of LeetCode 102 is not the queue alone, but the level boundary Recording level_size first is the most important stabilizing technique in the whole problem BFS is the primary template here, while DFS depth buckets are a strong alternative Any tree-shaped data that needs grouping by depth can reuse the same idea Once 102 is stable, the whole family of level-order problems becomes much easier References and Further Reading LeetCode 102: Binary Tree Level Order Traversal LeetCode 104: Maximum Depth of Binary Tree LeetCode 199: Binary Tree Right Side View LeetCode 637: Average of Levels in Binary Tree LeetCode 103: Binary Tree Zigzag Level Order Traversal CTA Practice 102, 107, and 199 together.\nThey are all variations of the same level-by-level BFS template. 
Only the output rule changes, which makes them ideal for locking in the queue-and-boundary pattern.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) from collections import deque class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def level_order(root): if root is None: return [] ans = [] q = deque([root]) while q: level_size = len(q) level = [] for _ in range(level_size): node = q.popleft() level.append(node.val) if node.left is not None: q.append(node.left) if node.right is not None: q.append(node.right) ans.append(level) return ans if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7))) print(level_order(root)) #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct TreeNode { int val; struct TreeNode* left; struct TreeNode* right; }; struct LevelOrderResult { int** levels; int* sizes; int count; }; struct TreeNode* new_node(int val) { struct TreeNode* node = (struct TreeNode*)malloc(sizeof(struct TreeNode)); node-\u0026gt;val = val; node-\u0026gt;left = NULL; node-\u0026gt;right = NULL; return node; } struct LevelOrderResult levelOrder(struct TreeNode* root) { struct LevelOrderResult res = {NULL, NULL, 0}; if (root == NULL) return res; struct TreeNode* queue[4096]; int front = 0; int back = 0; queue[back++] = root; res.levels = (int**)malloc(sizeof(int*) * 2048); res.sizes = (int*)malloc(sizeof(int) * 2048); while (front \u0026lt; back) { int levelSize = back - front; res.levels[res.count] = (int*)malloc(sizeof(int) * levelSize); res.sizes[res.count] = levelSize; for (int i = 0; i \u0026lt; levelSize; ++i) { struct TreeNode* node = queue[front++]; res.levels[res.count][i] = node-\u0026gt;val; if (node-\u0026gt;left) queue[back++] = node-\u0026gt;left; if (node-\u0026gt;right) queue[back++] = node-\u0026gt;right; } res.count++; } return res; } void print_result(struct LevelOrderResult* 
res) { printf(\u0026#34;[\u0026#34;); for (int i = 0; i \u0026lt; res-\u0026gt;count; ++i) { printf(\u0026#34;[\u0026#34;); for (int j = 0; j \u0026lt; res-\u0026gt;sizes[i]; ++j) { printf(\u0026#34;%d%s\u0026#34;, res-\u0026gt;levels[i][j], j + 1 == res-\u0026gt;sizes[i] ? \u0026#34;\u0026#34; : \u0026#34;,\u0026#34;); } printf(\u0026#34;]%s\u0026#34;, i + 1 == res-\u0026gt;count ? \u0026#34;\u0026#34; : \u0026#34;,\u0026#34;); } printf(\u0026#34;]\\n\u0026#34;); } void free_result(struct LevelOrderResult* res) { if (!res-\u0026gt;levels || !res-\u0026gt;sizes) return; for (int i = 0; i \u0026lt; res-\u0026gt;count; ++i) { free(res-\u0026gt;levels[i]); } free(res-\u0026gt;levels); free(res-\u0026gt;sizes); } void free_tree(struct TreeNode* root) { if (!root) return; free_tree(root-\u0026gt;left); free_tree(root-\u0026gt;right); free(root); } int main(void) { struct TreeNode* root = new_node(3); root-\u0026gt;left = new_node(9); root-\u0026gt;right = new_node(20); root-\u0026gt;right-\u0026gt;left = new_node(15); root-\u0026gt;right-\u0026gt;right = new_node(7); struct LevelOrderResult res = levelOrder(root); print_result(\u0026amp;res); free_result(\u0026amp;res); free_tree(root); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;queue\u0026gt; #include \u0026lt;vector\u0026gt; struct TreeNode { int val; TreeNode* left; TreeNode* right; explicit TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} }; std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt; levelOrder(TreeNode* root) { if (!root) return {}; std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt; ans; std::queue\u0026lt;TreeNode*\u0026gt; q; q.push(root); while (!q.empty()) { int size = static_cast\u0026lt;int\u0026gt;(q.size()); std::vector\u0026lt;int\u0026gt; level; for (int i = 0; i \u0026lt; size; ++i) { TreeNode* node = q.front(); q.pop(); level.push_back(node-\u0026gt;val); if (node-\u0026gt;left) q.push(node-\u0026gt;left); if (node-\u0026gt;right) 
q.push(node-\u0026gt;right); } ans.push_back(level); } return ans; } void freeTree(TreeNode* root) { if (!root) return; freeTree(root-\u0026gt;left); freeTree(root-\u0026gt;right); delete root; } int main() { TreeNode* root = new TreeNode(3); root-\u0026gt;left = new TreeNode(9); root-\u0026gt;right = new TreeNode(20); root-\u0026gt;right-\u0026gt;left = new TreeNode(15); root-\u0026gt;right-\u0026gt;right = new TreeNode(7); auto ans = levelOrder(root); std::cout \u0026lt;\u0026lt; \u0026#34;[\u0026#34;; for (size_t i = 0; i \u0026lt; ans.size(); ++i) { std::cout \u0026lt;\u0026lt; \u0026#34;[\u0026#34;; for (size_t j = 0; j \u0026lt; ans[i].size(); ++j) { std::cout \u0026lt;\u0026lt; ans[i][j] \u0026lt;\u0026lt; (j + 1 == ans[i].size() ? \u0026#34;\u0026#34; : \u0026#34;,\u0026#34;); } std::cout \u0026lt;\u0026lt; \u0026#34;]\u0026#34; \u0026lt;\u0026lt; (i + 1 == ans.size() ? \u0026#34;\u0026#34; : \u0026#34;,\u0026#34;); } std::cout \u0026lt;\u0026lt; \u0026#34;]\\n\u0026#34;; freeTree(root); return 0; } package main import \u0026#34;fmt\u0026#34; type TreeNode struct { Val int Left *TreeNode Right *TreeNode } func levelOrder(root *TreeNode) [][]int { if root == nil { return [][]int{} } q := []*TreeNode{root} ans := [][]int{} for len(q) \u0026gt; 0 { size := len(q) level := make([]int, 0, size) for i := 0; i \u0026lt; size; i++ { node := q[0] q = q[1:] level = append(level, node.Val) if node.Left != nil { q = append(q, node.Left) } if node.Right != nil { q = append(q, node.Right) } } ans = append(ans, level) } return ans } func main() { root := \u0026amp;TreeNode{ Val: 3, Left: \u0026amp;TreeNode{Val: 9}, Right: \u0026amp;TreeNode{Val: 20, Left: \u0026amp;TreeNode{Val: 15}, Right: \u0026amp;TreeNode{Val: 7}}, } fmt.Println(levelOrder(root)) } use std::cell::RefCell; use std::collections::VecDeque; use std::rc::Rc; type Node = Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;TreeNode\u0026gt;\u0026gt;\u0026gt;; #[derive(Debug, Clone)] struct TreeNode { val: i32, left: 
Node, right: Node, } impl TreeNode { fn new(val: i32) -\u0026gt; Rc\u0026lt;RefCell\u0026lt;TreeNode\u0026gt;\u0026gt; { Rc::new(RefCell::new(TreeNode { val, left: None, right: None, })) } } fn level_order(root: \u0026amp;Node) -\u0026gt; Vec\u0026lt;Vec\u0026lt;i32\u0026gt;\u0026gt; { let mut ans = Vec::new(); let mut q = VecDeque::new(); if let Some(node) = root { q.push_back(node.clone()); } else { return ans; } while !q.is_empty() { let size = q.len(); let mut level = Vec::with_capacity(size); for _ in 0..size { let node = q.pop_front().unwrap(); let node_ref = node.borrow(); level.push(node_ref.val); if let Some(left) = \u0026amp;node_ref.left { q.push_back(left.clone()); } if let Some(right) = \u0026amp;node_ref.right { q.push_back(right.clone()); } } ans.push(level); } ans } fn main() { let root = Some(TreeNode::new(3)); if let Some(node) = \u0026amp;root { node.borrow_mut().left = Some(TreeNode::new(9)); let right = Some(TreeNode::new(20)); node.borrow_mut().right = right.clone(); if let Some(r) = right { r.borrow_mut().left = Some(TreeNode::new(15)); r.borrow_mut().right = Some(TreeNode::new(7)); } } println!(\u0026#34;{:?}\u0026#34;, level_order(\u0026amp;root)); } function TreeNode(val, left = null, right = null) { this.val = val; this.left = left; this.right = right; } function levelOrder(root) { if (root === null) return []; const queue = [root]; const ans = []; while (queue.length) { const size = queue.length; const level = []; for (let i = 0; i \u0026lt; size; i += 1) { const node = queue.shift(); level.push(node.val); if (node.left !== null) queue.push(node.left); if (node.right !== null) queue.push(node.right); } ans.push(level); } return ans; } const root = new TreeNode(3, new TreeNode(9), new TreeNode(20, new TreeNode(15), new TreeNode(7))); console.log(levelOrder(root)); 
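The DFS depth-bucket alternative described in the Concepts section can be sketched as follows; a minimal illustration of the depth == len(res) trigger, assuming the same TreeNode shape as above:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def level_order_dfs(root):
    # res[depth] collects every value seen at that depth.
    res = []

    def dfs(node, depth):
        if node is None:
            return
        if depth == len(res):
            res.append([])  # first visit at this depth: open a new bucket
        res[depth].append(node.val)
        # Preorder (left before right) keeps each level left-to-right.
        dfs(node.left, depth + 1)
        dfs(node.right, depth + 1)

    dfs(root, 0)
    return res

root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7)))
print(level_order_dfs(root))  # [[3], [9, 20], [15, 7]]
```

Recursion depth is O(h), so for very deep trees the iterative BFS template remains the safer default.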
","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/binary-tree/102-binary-tree-level-order-traversal/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nLevel order traversal is the entry point of the binary-tree BFS template. The real key is not merely \u0026ldquo;use a queue\u0026rdquo;, but \u0026ldquo;separate one level from the next correctly\u0026rdquo;. This ACERS guide explains the level-size pattern, the DFS depth-bucket alternative, and engineering situations where grouped-by-depth traversal is useful.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003ebinary tree\u003c/code\u003e, \u003ccode\u003eBFS\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e, \u003ccode\u003equeue\u003c/code\u003e, \u003ccode\u003elevel order traversal\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Hot100, Binary Tree Level Order Traversal, BFS, queue, level order traversal, LeetCode 102\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A systematic guide to LeetCode 102 from level-by-level BFS to DFS depth buckets, with engineering scenarios and runnable multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want to make the BFS tree template stable\u003c/li\u003e\n\u003cli\u003eDevelopers who can traverse a tree but still mix up current-level and next-level boundaries\u003c/li\u003e\n\u003cli\u003eEngineers who need to group tree-shaped data by depth for display or execution\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 
id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eLeetCode 102 is one of the most standard tree-BFS starter problems.\u003c/p\u003e","title":"Hot100: Binary Tree Level Order Traversal (BFS / DFS ACERS Guide)"},{"content":" Subtitle / Summary\nThe hard part of Symmetric Tree is not traversal itself, but comparison direction. You are not comparing left to left and right to right. You are comparing mirrored positions. This ACERS guide explains the mirror-recursion contract, the BFS queue-of-pairs variant, and real engineering cases where symmetry checking matters.\nReading time: 10-12 min Tags: Hot100, binary tree, DFS, BFS, symmetry SEO keywords: Hot100, Symmetric Tree, mirror recursion, binary tree symmetry, BFS, LeetCode 101 Meta description: A systematic guide to LeetCode 101 from mirror recursion to pairwise BFS symmetry checks, with engineering scenarios and runnable multi-language implementations. Target Readers Hot100 learners moving from Same Tree to mirror comparison Developers who can write ordinary tree recursion but still mix up outside and inside pairs Engineers who need left-right symmetry validation for layouts, topology templates, or mirrored structures Background / Motivation LeetCode 101 is excellent training for directional thinking in tree problems:\nsymmetry does not mean the left and right subtrees are identical in the same direction it means the left side should match the right side after mirroring the comparison direction changes from \u0026ldquo;same direction\u0026rdquo; to \u0026ldquo;cross direction\u0026rdquo; Most mistakes fall into three groups:\nreusing the Same Tree logic and comparing left.left with right.left checking only node values while ignoring null positions flipping one subtree first, which adds an unnecessary transformation and makes reasoning harder What this problem really trains is the mirror-recursion template. 
Once that clicks, symmetry, mirror, and structure-matching questions become much easier.\nCore Concepts Mirror relation: left.left should match right.right, and left.right should match right.left Outside / inside pairing: compare the outer children together and the inner children together Pairwise recursion: the helper function answers whether two positions are mirror images Pairwise queueing: in BFS, the queue stores node pairs that must be checked together A - Algorithm (Problem and Algorithm) Problem Restatement Given the root node root of a binary tree, return true if the tree is symmetric around its center.\nA tree is symmetric if its left subtree and right subtree are mirror images of each other.\nInput / Output Name Type Description root TreeNode root of the binary tree return bool whether the tree is symmetric Example 1 input: root = [1,2,2,3,4,4,3] output: true explanation: The left subtree and the right subtree match exactly after mirroring. Example 2 input: root = [1,2,2,null,3,null,3] output: false explanation: The right child of the left subtree and the right child of the right subtree appear on the same side. That is not a mirror relation. Constraints The number of nodes is in the range [1, 1000] -100 \u0026lt;= Node.val \u0026lt;= 100 C - Concepts (Core Ideas) Thought Process: Symmetry means comparing mirrored positions Suppose you are comparing two nodes a and b. 
For them to be mirrors, the following must all hold:\nBoth are null: this mirrored position matches, so return true Exactly one is null: the structure is broken, so return false Values differ: the mirrored nodes do not match, so return false Both exist and values match: compare a.left with b.right compare a.right with b.left Written as a formula:\nmirror(a, b) = true, if a == null and b == null false, if exactly one is null false, if a.val != b.val mirror(a.left, b.right) and mirror(a.right, b.left), otherwise Why \u0026ldquo;left with left, right with right\u0026rdquo; is wrong here That pattern checks equality, not symmetry.\nThe core difference between 100 and 101 is exactly this:\nLeetCode 100 Same Tree: compare same-direction positions LeetCode 101 Symmetric Tree: compare mirrored positions If you do not switch the direction, you are solving a different problem.\nMethod Category Tree DFS Mirror recursion BFS with paired nodes Structural symmetry checking Why BFS also fits well You can also put mirror pairs into a queue:\nstart with root.left and root.right pop a pair and apply the mirror contract if the pair matches, push: left.left with right.right left.right with right.left This is the same logic as recursion, only written as an explicit process.\nPractice Guide / Steps Recommended Approach: Mirror recursion If root is null, return true Define a helper is_mirror(a, b) Inside the helper, keep the order: both null, one null, value mismatch, recursive mirror checks Return is_mirror(root.left, root.right) Runnable Python example:\nclass TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def is_symmetric(root): def is_mirror(a, b): if a is None and b is None: return True if a is None or b is None: return False if a.val != b.val: return False return is_mirror(a.left, b.right) and is_mirror(a.right, b.left) return True if root is None else is_mirror(root.left, root.right) if __name__ == 
\u0026#34;__main__\u0026#34;: root = TreeNode( 1, TreeNode(2, TreeNode(3), TreeNode(4)), TreeNode(2, TreeNode(4), TreeNode(3)), ) print(is_symmetric(root)) BFS Alternative The non-recursive version is straightforward:\nuse a queue of mirror pairs pop two nodes and compare them together if they match, push the outside pair and the inside pair This style is often easier to debug when you want to print the first non-symmetric pair explicitly.\nE - Engineering (Real-world Scenarios) Scenario 1: Validate mirrored two-column layouts (JavaScript) Background: visual editors often ship left-right mirrored page templates.\nWhy it fits: before publishing a template, you may want to ensure the two sides are strict mirror images to avoid broken placements.\nfunction isMirror(a, b) { if (!a \u0026amp;\u0026amp; !b) return true; if (!a || !b) return false; if (a.type !== b.type) return false; return isMirror(a.left, b.right) \u0026amp;\u0026amp; isMirror(a.right, b.left); } const left = { type: \u0026#34;Split\u0026#34;, left: { type: \u0026#34;Menu\u0026#34; }, right: { type: \u0026#34;Detail\u0026#34; } }; const right = { type: \u0026#34;Split\u0026#34;, left: { type: \u0026#34;Detail\u0026#34; }, right: { type: \u0026#34;Menu\u0026#34; } }; console.log(isMirror(left, right)); Scenario 2: Check active-active topology symmetry (Python) Background: some dual-site deployments require the left and right data-center templates to mirror each other in role and hierarchy.\nWhy it fits: before rollout, symmetry checks can catch missing nodes or role drift on one side.\ndef mirror_role(a, b): if a is None and b is None: return True if a is None or b is None: return False if a[\u0026#34;role\u0026#34;] != b[\u0026#34;role\u0026#34;]: return False return mirror_role(a.get(\u0026#34;left\u0026#34;), b.get(\u0026#34;right\u0026#34;)) and mirror_role(a.get(\u0026#34;right\u0026#34;), b.get(\u0026#34;left\u0026#34;)) left_dc = {\u0026#34;role\u0026#34;: \u0026#34;gateway\u0026#34;, 
\u0026#34;left\u0026#34;: {\u0026#34;role\u0026#34;: \u0026#34;api\u0026#34;}, \u0026#34;right\u0026#34;: {\u0026#34;role\u0026#34;: \u0026#34;db\u0026#34;}} right_dc = {\u0026#34;role\u0026#34;: \u0026#34;gateway\u0026#34;, \u0026#34;left\u0026#34;: {\u0026#34;role\u0026#34;: \u0026#34;db\u0026#34;}, \u0026#34;right\u0026#34;: {\u0026#34;role\u0026#34;: \u0026#34;api\u0026#34;}} print(mirror_role(left_dc, right_dc)) Scenario 3: Mirror-tree grading in an education tool (Go) Background: algorithm teaching systems sometimes ask students to build a tree that mirrors a target structure.\nWhy it fits: grading must check both node values and mirrored positions, not just a flat traversal result.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { Val int Left *Node Right *Node } func mirror(a, b *Node) bool { if a == nil \u0026amp;\u0026amp; b == nil { return true } if a == nil || b == nil { return false } if a.Val != b.Val { return false } return mirror(a.Left, b.Right) \u0026amp;\u0026amp; mirror(a.Right, b.Left) } func main() { left := \u0026amp;Node{Val: 2, Left: \u0026amp;Node{Val: 3}, Right: \u0026amp;Node{Val: 4}} right := \u0026amp;Node{Val: 2, Left: \u0026amp;Node{Val: 4}, Right: \u0026amp;Node{Val: 3}} fmt.Println(mirror(left, right)) } R - Reflection (Analysis and Deeper Understanding) Complexity Analysis Time complexity: O(n), because each node is compared at most once Space complexity: Recursive DFS: O(h), where h is the tree height BFS queue: worst-case O(w), where w is the maximum width of a level Alternative Approaches Method Time Extra Space Notes Mirror recursion O(n) O(h) Most aligned with the definition and usually the best answer BFS with paired nodes O(n) O(w) Explicit flow, easy to instrument for debugging Invert one side then compare O(n) O(h) or O(w) Adds an extra transformation and may mutate the tree Serialize and compare mirror order O(n) O(n) More cumbersome and still needs null markers Common Mistakes and Pitfalls Writing LeetCode 
101 as if it were LeetCode 100, still comparing left.left with right.left Comparing only values and ignoring null positions Flipping a subtree first, which is unnecessary and can introduce side effects Storing single nodes in the BFS queue instead of storing comparison pairs Common Questions and Notes 1. Is a single-node tree symmetric? Yes. Its left and right subtrees are both null, so they are mirror images by definition.\n2. Why not invert the left subtree and then compare it with the right subtree? You can, but it is not recommended here. It adds an extra transformation step, makes the reasoning longer, and can modify the original tree.\n3. How should I choose between DFS and BFS? For learning and interviews, recursion is usually the clearest form. If you want explicit logging of failing pairs or want to avoid deep recursion, BFS is a good alternative.\nBest Practices and Suggestions Before coding, draw the outside-pair and inside-pair relation on paper once Memorize the template: left.left with right.right, left.right with right.left Practice 100 and 101 as a pair to build directional intuition quickly If mirror recursion still feels slippery, write the BFS pair queue once to make the pairing explicit S - Summary The core of Symmetric Tree is mirror-position comparison, not traversal by itself Once you keep the outside-inside mirror contract stable, the recursion becomes reliable The BFS version is conceptually identical; it just makes the node pairs explicit This problem pairs naturally with 100 and 226 when building tree-structure intuition In engineering work, the same idea applies to mirrored layouts, mirrored templates, and symmetric topologies References and Further Reading LeetCode 101: Symmetric Tree LeetCode 100: Same Tree LeetCode 226: Invert Binary Tree LeetCode 104: Maximum Depth of Binary Tree LeetCode 102: Binary Tree Level Order Traversal CTA Practice 100, 101, and 226 as a small bundle.\n100 trains same-direction comparison, 101 trains 
mirror-direction comparison, and 226 trains structural transformation. Together, they build strong binary-tree intuition quickly.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def is_symmetric(root): def is_mirror(a, b): if a is None and b is None: return True if a is None or b is None: return False if a.val != b.val: return False return is_mirror(a.left, b.right) and is_mirror(a.right, b.left) return True if root is None else is_mirror(root.left, root.right) if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode( 1, TreeNode(2, TreeNode(3), TreeNode(4)), TreeNode(2, TreeNode(4), TreeNode(3)), ) print(is_symmetric(root)) #include \u0026lt;stdbool.h\u0026gt; #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct TreeNode { int val; struct TreeNode* left; struct TreeNode* right; }; struct TreeNode* new_node(int val) { struct TreeNode* node = (struct TreeNode*)malloc(sizeof(struct TreeNode)); node-\u0026gt;val = val; node-\u0026gt;left = NULL; node-\u0026gt;right = NULL; return node; } bool isMirror(struct TreeNode* a, struct TreeNode* b) { if (a == NULL \u0026amp;\u0026amp; b == NULL) return true; if (a == NULL || b == NULL) return false; if (a-\u0026gt;val != b-\u0026gt;val) return false; return isMirror(a-\u0026gt;left, b-\u0026gt;right) \u0026amp;\u0026amp; isMirror(a-\u0026gt;right, b-\u0026gt;left); } bool isSymmetric(struct TreeNode* root) { if (root == NULL) return true; return isMirror(root-\u0026gt;left, root-\u0026gt;right); } void free_tree(struct TreeNode* root) { if (!root) return; free_tree(root-\u0026gt;left); free_tree(root-\u0026gt;right); free(root); } int main(void) { struct TreeNode* root = new_node(1); root-\u0026gt;left = new_node(2); root-\u0026gt;right = new_node(2); root-\u0026gt;left-\u0026gt;left = new_node(3); root-\u0026gt;left-\u0026gt;right = new_node(4); 
root-\u0026gt;right-\u0026gt;left = new_node(4); root-\u0026gt;right-\u0026gt;right = new_node(3); printf(\u0026#34;%s\\n\u0026#34;, isSymmetric(root) ? \u0026#34;true\u0026#34; : \u0026#34;false\u0026#34;); free_tree(root); return 0; } #include \u0026lt;iostream\u0026gt; struct TreeNode { int val; TreeNode* left; TreeNode* right; explicit TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} }; bool isMirror(TreeNode* a, TreeNode* b) { if (!a \u0026amp;\u0026amp; !b) return true; if (!a || !b) return false; if (a-\u0026gt;val != b-\u0026gt;val) return false; return isMirror(a-\u0026gt;left, b-\u0026gt;right) \u0026amp;\u0026amp; isMirror(a-\u0026gt;right, b-\u0026gt;left); } bool isSymmetric(TreeNode* root) { if (!root) return true; return isMirror(root-\u0026gt;left, root-\u0026gt;right); } void freeTree(TreeNode* root) { if (!root) return; freeTree(root-\u0026gt;left); freeTree(root-\u0026gt;right); delete root; } int main() { TreeNode* root = new TreeNode(1); root-\u0026gt;left = new TreeNode(2); root-\u0026gt;right = new TreeNode(2); root-\u0026gt;left-\u0026gt;left = new TreeNode(3); root-\u0026gt;left-\u0026gt;right = new TreeNode(4); root-\u0026gt;right-\u0026gt;left = new TreeNode(4); root-\u0026gt;right-\u0026gt;right = new TreeNode(3); std::cout \u0026lt;\u0026lt; (isSymmetric(root) ? 
\u0026#34;true\u0026#34; : \u0026#34;false\u0026#34;) \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; freeTree(root); return 0; } package main import \u0026#34;fmt\u0026#34; type TreeNode struct { Val int Left *TreeNode Right *TreeNode } func isMirror(a *TreeNode, b *TreeNode) bool { if a == nil \u0026amp;\u0026amp; b == nil { return true } if a == nil || b == nil { return false } if a.Val != b.Val { return false } return isMirror(a.Left, b.Right) \u0026amp;\u0026amp; isMirror(a.Right, b.Left) } func isSymmetric(root *TreeNode) bool { if root == nil { return true } return isMirror(root.Left, root.Right) } func main() { root := \u0026amp;TreeNode{ Val: 1, Left: \u0026amp;TreeNode{ Val: 2, Left: \u0026amp;TreeNode{Val: 3}, Right: \u0026amp;TreeNode{Val: 4}, }, Right: \u0026amp;TreeNode{ Val: 2, Left: \u0026amp;TreeNode{Val: 4}, Right: \u0026amp;TreeNode{Val: 3}, }, } fmt.Println(isSymmetric(root)) } use std::cell::RefCell; use std::rc::Rc; type Node = Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;TreeNode\u0026gt;\u0026gt;\u0026gt;; #[derive(Debug, Clone)] struct TreeNode { val: i32, left: Node, right: Node, } impl TreeNode { fn new(val: i32) -\u0026gt; Rc\u0026lt;RefCell\u0026lt;TreeNode\u0026gt;\u0026gt; { Rc::new(RefCell::new(TreeNode { val, left: None, right: None, })) } } fn is_mirror(a: \u0026amp;Node, b: \u0026amp;Node) -\u0026gt; bool { match (a, b) { (None, None) =\u0026gt; true, (Some(x), Some(y)) =\u0026gt; { let xr = x.borrow(); let yr = y.borrow(); xr.val == yr.val \u0026amp;\u0026amp; is_mirror(\u0026amp;xr.left, \u0026amp;yr.right) \u0026amp;\u0026amp; is_mirror(\u0026amp;xr.right, \u0026amp;yr.left) } _ =\u0026gt; false, } } fn is_symmetric(root: \u0026amp;Node) -\u0026gt; bool { match root { None =\u0026gt; true, Some(node) =\u0026gt; { let node_ref = node.borrow(); is_mirror(\u0026amp;node_ref.left, \u0026amp;node_ref.right) } } } fn main() { let root = Some(TreeNode::new(1)); if let Some(node) = \u0026amp;root { let left = Some(TreeNode::new(2)); let right 
= Some(TreeNode::new(2)); node.borrow_mut().left = left.clone(); node.borrow_mut().right = right.clone(); if let Some(l) = left { l.borrow_mut().left = Some(TreeNode::new(3)); l.borrow_mut().right = Some(TreeNode::new(4)); } if let Some(r) = right { r.borrow_mut().left = Some(TreeNode::new(4)); r.borrow_mut().right = Some(TreeNode::new(3)); } } println!(\u0026#34;{}\u0026#34;, is_symmetric(\u0026amp;root)); } function TreeNode(val, left = null, right = null) { this.val = val; this.left = left; this.right = right; } function isMirror(a, b) { if (a === null \u0026amp;\u0026amp; b === null) return true; if (a === null || b === null) return false; if (a.val !== b.val) return false; return isMirror(a.left, b.right) \u0026amp;\u0026amp; isMirror(a.right, b.left); } function isSymmetric(root) { if (root === null) return true; return isMirror(root.left, root.right); } const root = new TreeNode( 1, new TreeNode(2, new TreeNode(3), new TreeNode(4)), new TreeNode(2, new TreeNode(4), new TreeNode(3)), ); console.log(isSymmetric(root)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/binary-tree/101-symmetric-tree/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThe hard part of Symmetric Tree is not traversal itself, but comparison direction. You are not comparing left to left and right to right. You are comparing mirrored positions. 
This ACERS guide explains the mirror-recursion contract, the BFS queue-of-pairs variant, and real engineering cases where symmetry checking matters.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003ebinary tree\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e, \u003ccode\u003eBFS\u003c/code\u003e, \u003ccode\u003esymmetry\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Hot100, Symmetric Tree, mirror recursion, binary tree symmetry, BFS, LeetCode 101\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A systematic guide to LeetCode 101 from mirror recursion to pairwise BFS symmetry checks, with engineering scenarios and runnable multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners moving from Same Tree to mirror comparison\u003c/li\u003e\n\u003cli\u003eDevelopers who can write ordinary tree recursion but still mix up outside and inside pairs\u003c/li\u003e\n\u003cli\u003eEngineers who need left-right symmetry validation for layouts, topology templates, or mirrored structures\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eLeetCode 101 is excellent training for directional thinking in tree problems:\u003c/p\u003e","title":"Hot100: Symmetric Tree (Mirror Recursion / BFS ACERS Guide)"},{"content":" Subtitle / Summary\nThe real challenge in LeetCode 100 is not \u0026ldquo;can you traverse a tree\u0026rdquo;, but \u0026ldquo;can you compare two trees node by node in lockstep\u0026rdquo;. 
This ACERS guide explains the synchronous-recursion contract, the queue-of-pairs BFS variant, and why the pattern matters in real engineering work.\nReading time: 9-11 min Tags: Hot100, binary tree, DFS, BFS, tree comparison SEO keywords: Hot100, Same Tree, binary tree comparison, synchronous recursion, BFS, LeetCode 100 Meta description: A systematic guide to LeetCode 100 from synchronous recursion to pairwise BFS comparison, with engineering scenarios and runnable multi-language implementations. Target Readers Hot100 learners who want to build a stable \u0026ldquo;compare two trees together\u0026rdquo; template Developers who can write DFS on one tree but get confused once two trees must be checked in parallel Engineers who need structural-equivalence checks for config trees, component trees, or syntax trees Background / Motivation When many people first see LeetCode 100, the instinct is:\ntraverse tree p traverse tree q compare the two traversal results afterward That can work only if you serialize very carefully, but it is not the core idea of the problem.\nThe real training value is:\ncan you pull out matching nodes from p and q at the same time can you turn \u0026ldquo;same tree\u0026rdquo; into a precise decision contract can you handle null cases before touching node values This matters far beyond one easy problem. 
The same pattern reappears in:\nsubtree checking mirror and symmetry checking structural comparison between two tree-shaped configurations So although LeetCode 100 is simple, it is the starting point of the two-tree synchronous comparison template.\nCore Concepts Synchronous recursion: the recursive function takes a pair of nodes (p, q), not just one node Structural equality: matching positions must either both be null or both exist Value equality: if both nodes exist, their values must match Pairwise traversal: whether you use DFS or BFS, the unit of work is always a pair of nodes A - Algorithm (Problem and Algorithm) Problem Restatement Given the root nodes p and q of two binary trees, return true if the trees are the same.\nTwo trees are considered the same if:\nthey have exactly the same structure corresponding nodes have the same values Input / Output Name Type Description p TreeNode root of the first binary tree q TreeNode root of the second binary tree return bool whether the two trees are exactly the same Example 1 input: p = [1,2,3], q = [1,2,3] output: true explanation: The structures match, and every corresponding node value is equal. Example 2 input: p = [1,2], q = [1,null,2] output: false explanation: At the second level, one tree has a left child while the other has a right child. The structure is different. Example 3 input: p = [1,2,1], q = [1,1,2] output: false explanation: The structure matches, but the values at corresponding positions do not. 
Constraints The number of nodes in both trees is in the range [0, 100] -10^4 \u0026lt;= Node.val \u0026lt;= 10^4 C - Concepts (Core Ideas) Thought Process: Break \u0026ldquo;same\u0026rdquo; into four decision rules For any pair of nodes (p, q), the answer is determined by four stable checks:\nBoth are null: this position matches, so return true Exactly one is null: the structure already differs, so return false Values differ: corresponding nodes are not equal, so return false Both exist and values match: continue checking p.left with q.left p.right with q.right Written as a formula:\nsame(p, q) = true, if p == null and q == null false, if exactly one is null false, if p.val != q.val same(p.left, q.left) and same(p.right, q.right), otherwise Why this is already the complete answer The definition of \u0026ldquo;same tree\u0026rdquo; has only two pieces:\nsame structure same values Every recursive call checks whether the current paired position satisfies those two requirements, then reduces the problem to the left children and right children. That is the classic pattern of local contract + same-shaped subproblem.\nMethod Category Tree DFS Synchronous recursion Pairwise BFS validation Structural equivalence checking Why BFS also works If you do not want recursion, store pairs in a queue:\npop one pair of nodes apply the same four rules as above if the current pair matches, push (left, left) and (right, right) back into the queue Nothing changes conceptually. 
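The pair-queue process described above can be sketched as a short Python function. This is a minimal sketch, assuming the same `TreeNode` shape used in the article's examples; the helper name `is_same_tree_bfs` is illustrative, not part of the original solutions:

```python
from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def is_same_tree_bfs(p, q):
    # The unit of work is a pair of nodes that must match.
    queue = deque([(p, q)])
    while queue:
        a, b = queue.popleft()
        if a is None and b is None:
            continue          # this paired position matches; nothing to expand
        if a is None or b is None:
            return False      # structure differs at this position
        if a.val != b.val:
            return False      # values differ at this position
        # Same-direction children: (left, left) and (right, right).
        queue.append((a.left, b.left))
        queue.append((a.right, b.right))
    return True

if __name__ == "__main__":
    x = TreeNode(1, TreeNode(2), TreeNode(3))
    y = TreeNode(1, TreeNode(2), TreeNode(3))
    print(is_same_tree_bfs(x, y))
```

Because failures surface at the front of the queue, this variant is convenient when you want to log the first mismatching pair before returning.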
Only the execution model changes from the call stack to an explicit queue.\nPractice Guide / Steps Recommended Approach: Synchronous recursion Define a function is_same_tree(p, q) Handle the null combinations first Compare the current node values Recursively compare left-left and right-right Runnable Python example:\nclass TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def is_same_tree(p, q): if p is None and q is None: return True if p is None or q is None: return False if p.val != q.val: return False return is_same_tree(p.left, q.left) and is_same_tree(p.right, q.right) if __name__ == \u0026#34;__main__\u0026#34;: a = TreeNode(1, TreeNode(2), TreeNode(3)) b = TreeNode(1, TreeNode(2), TreeNode(3)) print(is_same_tree(a, b)) BFS Alternative If you prefer an explicit control flow or want to avoid recursion depth concerns:\nkeep (p, q) pairs in a queue pop one pair at a time and compare it if the pair is valid, push their children in matching directions This style is also convenient when you want to log exactly which pair first fails.\nE - Engineering (Real-world Scenarios) Scenario 1: Detect config-tree drift before release (Python) Background: feature flags, permission inheritance, and routing rules are often stored as tree-shaped nested configs.\nWhy it fits: before a rollout, teams may need to confirm that the staging config tree is exactly the same as production.\ndef same_config(a, b): if a is None and b is None: return True if a is None or b is None: return False if a[\u0026#34;name\u0026#34;] != b[\u0026#34;name\u0026#34;]: return False return same_config(a.get(\u0026#34;left\u0026#34;), b.get(\u0026#34;left\u0026#34;)) and same_config(a.get(\u0026#34;right\u0026#34;), b.get(\u0026#34;right\u0026#34;)) cfg1 = {\u0026#34;name\u0026#34;: \u0026#34;root\u0026#34;, \u0026#34;left\u0026#34;: {\u0026#34;name\u0026#34;: \u0026#34;A\u0026#34;}, \u0026#34;right\u0026#34;: {\u0026#34;name\u0026#34;: 
\u0026#34;B\u0026#34;}} cfg2 = {\u0026#34;name\u0026#34;: \u0026#34;root\u0026#34;, \u0026#34;left\u0026#34;: {\u0026#34;name\u0026#34;: \u0026#34;A\u0026#34;}, \u0026#34;right\u0026#34;: {\u0026#34;name\u0026#34;: \u0026#34;B\u0026#34;}} print(same_config(cfg1, cfg2)) Scenario 2: Check component-tree snapshot equivalence (JavaScript) Background: low-code editors and page builders often save UI layouts as component trees.\nWhy it fits: regression checks need to confirm that two snapshots match in both node type and parent-child structure.\nfunction sameTree(a, b) { if (!a \u0026amp;\u0026amp; !b) return true; if (!a || !b) return false; if (a.type !== b.type) return false; return sameTree(a.left, b.left) \u0026amp;\u0026amp; sameTree(a.right, b.right); } const oldTree = { type: \u0026#34;Split\u0026#34;, left: { type: \u0026#34;List\u0026#34; }, right: { type: \u0026#34;Form\u0026#34; } }; const newTree = { type: \u0026#34;Split\u0026#34;, left: { type: \u0026#34;List\u0026#34; }, right: { type: \u0026#34;Form\u0026#34; } }; console.log(sameTree(oldTree, newTree)); Scenario 3: Validate AST structure after a rewrite pass (Go) Background: compilers, linters, and rule engines often rewrite syntax trees.\nWhy it fits: after a rewrite, you may want to verify that the resulting tree shape and labels match the expected output exactly.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { Label string Left *Node Right *Node } func same(a, b *Node) bool { if a == nil \u0026amp;\u0026amp; b == nil { return true } if a == nil || b == nil { return false } if a.Label != b.Label { return false } return same(a.Left, b.Left) \u0026amp;\u0026amp; same(a.Right, b.Right) } func main() { x := \u0026amp;Node{Label: \u0026#34;Add\u0026#34;, Left: \u0026amp;Node{Label: \u0026#34;A\u0026#34;}, Right: \u0026amp;Node{Label: \u0026#34;B\u0026#34;}} y := \u0026amp;Node{Label: \u0026#34;Add\u0026#34;, Left: \u0026amp;Node{Label: \u0026#34;A\u0026#34;}, Right: \u0026amp;Node{Label: 
\u0026#34;B\u0026#34;}} fmt.Println(same(x, y)) } R - Reflection (Analysis and Deeper Understanding) Complexity Analysis Time complexity: O(n), where n is the number of compared paired positions; in the worst case, every corresponding node is visited Space complexity: Recursive DFS: O(h), where h is tree height BFS queue: worst-case O(w), where w is the maximum width of a level Alternative Approaches Method Time Extra Space Notes Synchronous recursion O(n) O(h) Most aligned with the definition and usually the best answer BFS with node pairs O(n) O(w) Non-recursive and easy to debug Serialize then compare O(n) O(n) Must encode null positions or it can give false positives Hash-signature comparison Depends on design Extra hash storage Useful as a quick filter in some systems, but less direct than plain comparison Common Mistakes and Pitfalls Comparing only preorder or inorder values without encoding null positions Accidentally writing same(p.left, q.right), which is the mirror template for LeetCode 101, not this problem Touching p.val before checking whether p or q is null Confusing \u0026ldquo;same value\u0026rdquo; with \u0026ldquo;same object identity in memory\u0026rdquo; Common Questions and Notes 1. Why can\u0026rsquo;t we just compare traversal results? Because different structures can produce the same value sequence.\nIf you truly serialize, you must include null markers as well.\n2. Does the problem ask whether p and q are the same object? No. The question is about same structure + same values, not whether the two roots point to the same memory.\n3. Which is more recommended, DFS or BFS? For interviews and learning, recursion is shorter and closer to the definition. 
In engineering work, BFS can be more convenient if you want to log the failing comparison path or avoid deep recursion.\nBest Practices and Suggestions For two-tree problems, ask yourself first whether the recursive function should take two nodes Put null checks at the very top; it removes a large class of bugs Keep the concepts separate: same value, same structure, and same reference are different things When you see 100, 101, or 572, think \u0026ldquo;paired comparison template\u0026rdquo; immediately S - Summary The essence of LeetCode 100 is paired comparison, not separate traversal Once you keep the four decision rules stable, synchronous recursion becomes very reliable The BFS version is the same idea with an explicit queue of pairs Structural equivalence checks show up directly in config trees, component trees, and syntax trees If you can write 100 smoothly, LeetCode 101 becomes much easier next References and Further Reading LeetCode 100: Same Tree LeetCode 101: Symmetric Tree LeetCode 572: Subtree of Another Tree LeetCode 226: Invert Binary Tree LeetCode 102: Binary Tree Level Order Traversal CTA Practice 100 and 101 back to back.\n100 trains same-direction comparison, while 101 trains mirror-direction comparison. 
Once those two templates are clear, binary-tree judgment problems become much easier to reason about.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def is_same_tree(p, q): if p is None and q is None: return True if p is None or q is None: return False if p.val != q.val: return False return is_same_tree(p.left, q.left) and is_same_tree(p.right, q.right) if __name__ == \u0026#34;__main__\u0026#34;: p = TreeNode(1, TreeNode(2), TreeNode(3)) q = TreeNode(1, TreeNode(2), TreeNode(3)) print(is_same_tree(p, q)) #include \u0026lt;stdbool.h\u0026gt; #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct TreeNode { int val; struct TreeNode* left; struct TreeNode* right; }; struct TreeNode* new_node(int val) { struct TreeNode* node = (struct TreeNode*)malloc(sizeof(struct TreeNode)); node-\u0026gt;val = val; node-\u0026gt;left = NULL; node-\u0026gt;right = NULL; return node; } bool isSameTree(struct TreeNode* p, struct TreeNode* q) { if (p == NULL \u0026amp;\u0026amp; q == NULL) return true; if (p == NULL || q == NULL) return false; if (p-\u0026gt;val != q-\u0026gt;val) return false; return isSameTree(p-\u0026gt;left, q-\u0026gt;left) \u0026amp;\u0026amp; isSameTree(p-\u0026gt;right, q-\u0026gt;right); } void free_tree(struct TreeNode* root) { if (!root) return; free_tree(root-\u0026gt;left); free_tree(root-\u0026gt;right); free(root); } int main(void) { struct TreeNode* p = new_node(1); p-\u0026gt;left = new_node(2); p-\u0026gt;right = new_node(3); struct TreeNode* q = new_node(1); q-\u0026gt;left = new_node(2); q-\u0026gt;right = new_node(3); printf(\u0026#34;%s\\n\u0026#34;, isSameTree(p, q) ? 
\u0026#34;true\u0026#34; : \u0026#34;false\u0026#34;); free_tree(p); free_tree(q); return 0; } #include \u0026lt;iostream\u0026gt; struct TreeNode { int val; TreeNode* left; TreeNode* right; explicit TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} }; bool isSameTree(TreeNode* p, TreeNode* q) { if (!p \u0026amp;\u0026amp; !q) return true; if (!p || !q) return false; if (p-\u0026gt;val != q-\u0026gt;val) return false; return isSameTree(p-\u0026gt;left, q-\u0026gt;left) \u0026amp;\u0026amp; isSameTree(p-\u0026gt;right, q-\u0026gt;right); } void freeTree(TreeNode* root) { if (!root) return; freeTree(root-\u0026gt;left); freeTree(root-\u0026gt;right); delete root; } int main() { TreeNode* p = new TreeNode(1); p-\u0026gt;left = new TreeNode(2); p-\u0026gt;right = new TreeNode(3); TreeNode* q = new TreeNode(1); q-\u0026gt;left = new TreeNode(2); q-\u0026gt;right = new TreeNode(3); std::cout \u0026lt;\u0026lt; (isSameTree(p, q) ? \u0026#34;true\u0026#34; : \u0026#34;false\u0026#34;) \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; freeTree(p); freeTree(q); return 0; } package main import \u0026#34;fmt\u0026#34; type TreeNode struct { Val int Left *TreeNode Right *TreeNode } func isSameTree(p *TreeNode, q *TreeNode) bool { if p == nil \u0026amp;\u0026amp; q == nil { return true } if p == nil || q == nil { return false } if p.Val != q.Val { return false } return isSameTree(p.Left, q.Left) \u0026amp;\u0026amp; isSameTree(p.Right, q.Right) } func main() { p := \u0026amp;TreeNode{Val: 1, Left: \u0026amp;TreeNode{Val: 2}, Right: \u0026amp;TreeNode{Val: 3}} q := \u0026amp;TreeNode{Val: 1, Left: \u0026amp;TreeNode{Val: 2}, Right: \u0026amp;TreeNode{Val: 3}} fmt.Println(isSameTree(p, q)) } use std::cell::RefCell; use std::rc::Rc; type Node = Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;TreeNode\u0026gt;\u0026gt;\u0026gt;; #[derive(Debug, Clone)] struct TreeNode { val: i32, left: Node, right: Node, } impl TreeNode { fn new(val: i32) -\u0026gt; 
Rc\u0026lt;RefCell\u0026lt;TreeNode\u0026gt;\u0026gt; { Rc::new(RefCell::new(TreeNode { val, left: None, right: None, })) } } fn is_same_tree(p: \u0026amp;Node, q: \u0026amp;Node) -\u0026gt; bool { match (p, q) { (None, None) =\u0026gt; true, (Some(a), Some(b)) =\u0026gt; { let a_ref = a.borrow(); let b_ref = b.borrow(); a_ref.val == b_ref.val \u0026amp;\u0026amp; is_same_tree(\u0026amp;a_ref.left, \u0026amp;b_ref.left) \u0026amp;\u0026amp; is_same_tree(\u0026amp;a_ref.right, \u0026amp;b_ref.right) } _ =\u0026gt; false, } } fn main() { let p = Some(TreeNode::new(1)); let q = Some(TreeNode::new(1)); if let Some(root) = \u0026amp;p { root.borrow_mut().left = Some(TreeNode::new(2)); root.borrow_mut().right = Some(TreeNode::new(3)); } if let Some(root) = \u0026amp;q { root.borrow_mut().left = Some(TreeNode::new(2)); root.borrow_mut().right = Some(TreeNode::new(3)); } println!(\u0026#34;{}\u0026#34;, is_same_tree(\u0026amp;p, \u0026amp;q)); } function TreeNode(val, left = null, right = null) { this.val = val; this.left = left; this.right = right; } function isSameTree(p, q) { if (p === null \u0026amp;\u0026amp; q === null) return true; if (p === null || q === null) return false; if (p.val !== q.val) return false; return isSameTree(p.left, q.left) \u0026amp;\u0026amp; isSameTree(p.right, q.right); } const p = new TreeNode(1, new TreeNode(2), new TreeNode(3)); const q = new TreeNode(1, new TreeNode(2), new TreeNode(3)); console.log(isSameTree(p, q)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/binary-tree/100-same-tree/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThe real challenge in LeetCode 100 is not \u0026ldquo;can you traverse a tree\u0026rdquo;, but \u0026ldquo;can you compare two trees node by node in lockstep\u0026rdquo;. 
This ACERS guide explains the synchronous-recursion contract, the queue-of-pairs BFS variant, and why the pattern matters in real engineering work.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 9-11 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003ebinary tree\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e, \u003ccode\u003eBFS\u003c/code\u003e, \u003ccode\u003etree comparison\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Hot100, Same Tree, binary tree comparison, synchronous recursion, BFS, LeetCode 100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A systematic guide to LeetCode 100 from synchronous recursion to pairwise BFS comparison, with engineering scenarios and runnable multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want to build a stable \u0026ldquo;compare two trees together\u0026rdquo; template\u003c/li\u003e\n\u003cli\u003eDevelopers who can write DFS on one tree but get confused once two trees must be checked in parallel\u003c/li\u003e\n\u003cli\u003eEngineers who need structural-equivalence checks for config trees, component trees, or syntax trees\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eWhen many people first see LeetCode 100, the instinct is:\u003c/p\u003e","title":"Hot100: Same Tree (Synchronous Recursion / BFS ACERS Guide)"},{"content":" Subtitle / Summary\nInvert Binary Tree looks tiny, but it is one of the fastest ways to test whether you really understand recursive structure on trees. 
This guide uses LeetCode 226 to break down the essence of \u0026ldquo;swap left and right subtrees\u0026rdquo;, covers both recursion and BFS, and shows how the same idea transfers to engineering scenarios.\nReading time: 8-10 min Tags: Hot100, binary tree, recursion, BFS, tree transformation SEO keywords: Hot100, Invert Binary Tree, tree mirror, recursion, BFS, LeetCode 226 Meta description: Learn the recursive and BFS solutions for LeetCode 226, then extend the idea to layout mirroring and structural transformations. Target Readers Hot100 learners who want to verify whether they truly understand \u0026ldquo;apply recursion to every node in the whole tree\u0026rdquo; Developers who instinctively start traversing any tree problem, but are unsure when to process the current node Engineers who need tree mirroring, layout inversion, or symmetric structural transforms Background / Motivation The code for LeetCode 226 is usually very short, but the thinking pattern is extremely typical:\nWhat should the current node do?\nSwap left and right.\nWhat is the subproblem?\nThe left and right subtrees must also be inverted.\nThis is a very pure example of current operation + recursive handling of identical subproblems.\nIf you do not fully internalize this problem, you often end up with mistakes like:\nswapping only the root and forgetting the subtrees getting the recursive direction mixed up after the swap rebuilding a brand new tree for something that can be done in place Core Concepts Tree mirror: swap the left and right subtree of every node In-place transform: do not rebuild the whole tree; only swap pointers or references Recursive divide-and-conquer: after handling the current node, each subtree is still the same kind of problem BFS level-order transform: you can also swap each node\u0026rsquo;s children level by level A - Algorithm (Problem and Algorithm) Problem Restatement Given the root node root of a binary tree, invert the tree and return the root of the inverted 
tree.\nInput / Output Name Type Description root TreeNode root of the binary tree, may be null return TreeNode root of the inverted tree Example 1 input: root = [4,2,7,1,3,6,9] output: [4,7,2,9,6,3,1] explanation: after swapping the left and right subtrees across the whole tree, every node is mirrored. Example 2 input: root = [2,1,3] output: [2,3,1] Example 3 input: root = [] output: [] Constraints The number of nodes is in the range [0, 100] -100 \u0026lt;= Node.val \u0026lt;= 100 C - Concepts (Core Ideas) Thought Process: Why \u0026ldquo;swap + recursion\u0026rdquo; is enough Suppose the current node is node. We only need two steps:\nSwap node.left and node.right Recursively invert the new left subtree and the new right subtree The pseudocode is very short:\ninvert(node): if node is null: return null swap node.left and node.right invert(node.left) invert(node.right) return node Why this is the complete answer Because inverting the whole tree is essentially \u0026ldquo;perform one left-right swap on every node\u0026rdquo;.\nAnd every local subtree is still a tree, so recursion fits naturally.\nMethod Category Tree recursion In-place structural transform BFS / queue traversal Recursion vs BFS Recursion\nshortest code matches the recursive definition of a tree recommended as the main solution BFS\nswap nodes level by level useful if you also want to do level statistics or visualization Practice Guide / Steps Recommended Approach: Recursion Handle the null case Swap the left and right child Recursively invert the left subtree Recursively invert the right subtree Return the current node Runnable Python example:\nfrom collections import deque class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def invert_tree(root): if root is None: return None root.left, root.right = root.right, root.left invert_tree(root.left) invert_tree(root.right) return root def level_order(root): if root is None: return [] q = 
deque([root]) res = [] while q: node = q.popleft() res.append(node.val) if node.left: q.append(node.left) if node.right: q.append(node.right) return res if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode(4, TreeNode(2, TreeNode(1), TreeNode(3)), TreeNode(7, TreeNode(6), TreeNode(9))) invert_tree(root) print(level_order(root)) E - Engineering (Real-world Scenarios) Scenario 1: Mirrored split-pane layout preview (JavaScript) Background: visual editors often store split-pane layouts as binary trees.\nWhy it fits: a \u0026ldquo;mirror preview\u0026rdquo; is essentially swapping the left and right regions at every split node.\nfunction Pane(name, left = null, right = null) { this.name = name; this.left = left; this.right = right; } function invert(node) { if (!node) return null; [node.left, node.right] = [node.right, node.left]; invert(node.left); invert(node.right); return node; } const root = new Pane(\u0026#34;root\u0026#34;, new Pane(\u0026#34;left\u0026#34;), new Pane(\u0026#34;right\u0026#34;)); console.log(invert(root)); Scenario 2: Tree-mirroring demos in teaching tools (Python) Background: algorithm teaching platforms often need dynamic demonstrations of the \u0026ldquo;mirror\u0026rdquo; concept.\nWhy it fits: the solution to LeetCode 226 is the standard tree-mirroring transform.\nclass Node: def __init__(self, val, left=None, right=None): self.val = val self.left = left self.right = right def invert(node): if node is None: return None node.left, node.right = invert(node.right), invert(node.left) return node root = Node(\u0026#34;A\u0026#34;, Node(\u0026#34;B\u0026#34;), Node(\u0026#34;C\u0026#34;)) print(invert(root).left.val) Scenario 3: Left/right branch inversion tests for rule trees (Go) Background: some rule engines organize \u0026ldquo;match / no-match\u0026rdquo; branches as binary trees.\nWhy it fits: when doing mirror-based tests, this quickly verifies whether the left/right branch logic is symmetric.\npackage main import 
\u0026#34;fmt\u0026#34; type Node struct { Name string Left *Node Right *Node } func invert(node *Node) *Node { if node == nil { return nil } node.Left, node.Right = invert(node.Right), invert(node.Left) return node } func main() { root := \u0026amp;Node{\u0026#34;root\u0026#34;, \u0026amp;Node{\u0026#34;allow\u0026#34;, nil, nil}, \u0026amp;Node{\u0026#34;deny\u0026#34;, nil, nil}} root = invert(root) fmt.Println(root.Left.Name, root.Right.Name) } R - Reflection (Analysis and Deeper Understanding) Complexity Analysis Time complexity: O(n), because each node is swapped once Space complexity: Recursion: O(h) BFS: O(w), where w is the maximum width of the tree Alternative Approaches Method Time Extra Space Notes Recursion O(n) O(h) Simplest and recommended BFS queue O(n) O(w) Convenient for level-by-level processing Build a new mirror tree O(n) O(n) Unnecessary extra memory allocation Common Mistakes and Pitfalls Swapping only the root once and forgetting to recurse into the subtrees Swapping first, then recursing through stale references and confusing the logic Rebuilding a whole new tree even though the job can be done in place Confusing \u0026ldquo;invert binary tree\u0026rdquo; with \u0026ldquo;reverse linked list\u0026rdquo; and mistakenly thinking a linear reconnection order is needed Common Questions and Notes 1. Can I recurse first and swap later? Yes. As long as every node eventually completes the left-right swap, the result is correct.\nBut \u0026ldquo;swap first, recurse later\u0026rdquo; is usually the most intuitive version.\n2. Which is better in interviews, recursion or BFS? Recursion is the default choice for this problem. BFS is better treated as an alternative implementation that shows you understand traversal variations.\n3. Is this preorder or postorder? 
It is closer to a preorder-style action: the current node swaps immediately, and only then do we process the subtrees.\nBest Practices and Suggestions For tree-transform problems, first ask \u0026ldquo;what changes at the current node?\u0026rdquo;, then ask \u0026ldquo;is each subtree the same type of subproblem?\u0026rdquo; If you can swap in place, do that and avoid unnecessary object allocation Keep the recursive function semantic simple: input a tree, return the inverted version of that same tree Do not just memorize the code; be able to explain verbally why recursion is such a natural fit here S - Summary The essence of LeetCode 226 is one left-right swap at every node Recursion works because every subtree is still the same problem This is a classic template of \u0026ldquo;process current node + recurse on subproblems\u0026rdquo; BFS also works, but recursion expresses the idea more directly In engineering, the same pattern applies to layout mirroring, visualization mirroring, and symmetry tests on rule trees References and Further Reading LeetCode 226: Invert Binary Tree LeetCode 101: Symmetric Tree LeetCode 100: Same Tree LeetCode 104: Maximum Depth of Binary Tree LeetCode 102: Binary Tree Level Order Traversal CTA Try solving 226, 101, and 100 together.\nOne trains structural transformation, one trains structural comparison, and together they make tree recursion feel much more solid.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) from collections import deque class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def invert_tree(root): if root is None: return None root.left, root.right = root.right, root.left invert_tree(root.left) invert_tree(root.right) return root def level_order(root): if root is None: return [] q = deque([root]) res = [] while q: node = q.popleft() res.append(node.val) if node.left: q.append(node.left) if node.right: q.append(node.right) return 
res if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode(4, TreeNode(2, TreeNode(1), TreeNode(3)), TreeNode(7, TreeNode(6), TreeNode(9))) invert_tree(root) print(level_order(root)) #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct TreeNode { int val; struct TreeNode* left; struct TreeNode* right; }; struct TreeNode* new_node(int val) { struct TreeNode* node = (struct TreeNode*)malloc(sizeof(struct TreeNode)); node-\u0026gt;val = val; node-\u0026gt;left = NULL; node-\u0026gt;right = NULL; return node; } struct TreeNode* invertTree(struct TreeNode* root) { if (root == NULL) return NULL; struct TreeNode* tmp = root-\u0026gt;left; root-\u0026gt;left = invertTree(root-\u0026gt;right); root-\u0026gt;right = invertTree(tmp); return root; } void preorder(struct TreeNode* root) { if (!root) return; printf(\u0026#34;%d \u0026#34;, root-\u0026gt;val); preorder(root-\u0026gt;left); preorder(root-\u0026gt;right); } void free_tree(struct TreeNode* root) { if (!root) return; free_tree(root-\u0026gt;left); free_tree(root-\u0026gt;right); free(root); } int main(void) { struct TreeNode* root = new_node(4); root-\u0026gt;left = new_node(2); root-\u0026gt;right = new_node(7); root-\u0026gt;left-\u0026gt;left = new_node(1); root-\u0026gt;left-\u0026gt;right = new_node(3); root-\u0026gt;right-\u0026gt;left = new_node(6); root-\u0026gt;right-\u0026gt;right = new_node(9); invertTree(root); preorder(root); printf(\u0026#34;\\n\u0026#34;); free_tree(root); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;utility\u0026gt; struct TreeNode { int val; TreeNode* left; TreeNode* right; explicit TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} }; TreeNode* invertTree(TreeNode* root) { if (!root) return nullptr; std::swap(root-\u0026gt;left, root-\u0026gt;right); invertTree(root-\u0026gt;left); invertTree(root-\u0026gt;right); return root; } void preorder(TreeNode* root) { if (!root) return; std::cout \u0026lt;\u0026lt; 
root-\u0026gt;val \u0026lt;\u0026lt; \u0026#39; \u0026#39;; preorder(root-\u0026gt;left); preorder(root-\u0026gt;right); } void freeTree(TreeNode* root) { if (!root) return; freeTree(root-\u0026gt;left); freeTree(root-\u0026gt;right); delete root; } int main() { TreeNode* root = new TreeNode(4); root-\u0026gt;left = new TreeNode(2); root-\u0026gt;right = new TreeNode(7); root-\u0026gt;left-\u0026gt;left = new TreeNode(1); root-\u0026gt;left-\u0026gt;right = new TreeNode(3); root-\u0026gt;right-\u0026gt;left = new TreeNode(6); root-\u0026gt;right-\u0026gt;right = new TreeNode(9); invertTree(root); preorder(root); std::cout \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; freeTree(root); return 0; } package main import \u0026#34;fmt\u0026#34; type TreeNode struct { Val int Left *TreeNode Right *TreeNode } func invertTree(root *TreeNode) *TreeNode { if root == nil { return nil } root.Left, root.Right = invertTree(root.Right), invertTree(root.Left) return root } func preorder(root *TreeNode) { if root == nil { return } fmt.Print(root.Val, \u0026#34; \u0026#34;) preorder(root.Left) preorder(root.Right) } func main() { root := \u0026amp;TreeNode{ Val: 4, Left: \u0026amp;TreeNode{ Val: 2, Left: \u0026amp;TreeNode{Val: 1}, Right: \u0026amp;TreeNode{Val: 3}, }, Right: \u0026amp;TreeNode{ Val: 7, Left: \u0026amp;TreeNode{Val: 6}, Right: \u0026amp;TreeNode{Val: 9}, }, } invertTree(root) preorder(root) fmt.Println() } #[derive(Debug)] struct TreeNode { val: i32, left: Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, right: Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, } fn invert_tree(root: \u0026amp;mut Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;) { if let Some(node) = root { std::mem::swap(\u0026amp;mut node.left, \u0026amp;mut node.right); invert_tree(\u0026amp;mut node.left); invert_tree(\u0026amp;mut node.right); } } fn preorder(root: \u0026amp;Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;) { if let Some(node) = root { print!(\u0026#34;{} 
\u0026#34;, node.val); preorder(\u0026amp;node.left); preorder(\u0026amp;node.right); } } fn main() { let mut root = Some(Box::new(TreeNode { val: 4, left: Some(Box::new(TreeNode { val: 2, left: Some(Box::new(TreeNode { val: 1, left: None, right: None, })), right: Some(Box::new(TreeNode { val: 3, left: None, right: None, })), })), right: Some(Box::new(TreeNode { val: 7, left: Some(Box::new(TreeNode { val: 6, left: None, right: None, })), right: Some(Box::new(TreeNode { val: 9, left: None, right: None, })), })), })); invert_tree(\u0026amp;mut root); preorder(\u0026amp;root); println!(); } function TreeNode(val, left = null, right = null) { this.val = val; this.left = left; this.right = right; } function invertTree(root) { if (!root) return null; [root.left, root.right] = [invertTree(root.right), invertTree(root.left)]; return root; } function preorder(root, out = []) { if (!root) return out; out.push(root.val); preorder(root.left, out); preorder(root.right, out); return out; } const root = new TreeNode( 4, new TreeNode(2, new TreeNode(1), new TreeNode(3)), new TreeNode(7, new TreeNode(6), new TreeNode(9)) ); invertTree(root); console.log(preorder(root)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/binary-tree/226-invert-binary-tree/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nInvert Binary Tree looks tiny, but it is one of the fastest ways to test whether you really understand recursive structure on trees. 
This guide uses LeetCode 226 to break down the essence of \u0026ldquo;swap left and right subtrees\u0026rdquo;, covers both recursion and BFS, and shows how the same idea transfers to engineering scenarios.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 8-10 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003ebinary tree\u003c/code\u003e, \u003ccode\u003erecursion\u003c/code\u003e, \u003ccode\u003eBFS\u003c/code\u003e, \u003ccode\u003etree transformation\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Hot100, Invert Binary Tree, tree mirror, recursion, BFS, LeetCode 226\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Learn the recursive and BFS solutions for LeetCode 226, then extend the idea to layout mirroring and structural transformations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want to verify whether they truly understand \u0026ldquo;apply recursion to every node in the whole tree\u0026rdquo;\u003c/li\u003e\n\u003cli\u003eDevelopers who instinctively start traversing any tree problem, but are unsure when to process the current node\u003c/li\u003e\n\u003cli\u003eEngineers who need tree mirroring, layout inversion, or symmetric structural transforms\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThe code for LeetCode 226 is usually very short, but the thinking pattern is extremely typical:\u003c/p\u003e","title":"Hot100: Invert Binary Tree (Recursion / BFS ACERS Guide)"},{"content":" Subtitle / Summary\n\u0026ldquo;Maximum depth\u0026rdquo; is one of the cleanest starting points for tree recursion. 
Once you truly understand that the answer for the current tree depends on the answers from its left and right subtrees, a whole family of tree DP and DFS problems becomes easier. This guide uses LeetCode 104 to explain recursive DFS, level-order BFS, and the engineering value of the same pattern.\nReading time: 9-11 min Tags: Hot100, binary tree, DFS, BFS, recursion SEO keywords: Hot100, Maximum Depth of Binary Tree, DFS, BFS, LeetCode 104 Meta description: Learn the DFS and BFS solutions for LeetCode 104 from the definition of depth, with engineering mappings and runnable multi-language code. Target Readers Learners who are just starting tree problems and want to truly internalize \u0026ldquo;tree recursion return values\u0026rdquo; Developers who can write traversals but get confused once the task becomes \u0026ldquo;compute height\u0026rdquo;, \u0026ldquo;compute path\u0026rdquo;, or \u0026ldquo;compute an answer\u0026rdquo; Engineers who need depth analysis on hierarchical data such as menus, org charts, or nested JSON Background / Motivation LeetCode 104 looks like an easy problem, but it is almost the parent problem of tree recursion:\nyou first need to answer \u0026ldquo;what is the depth of an empty tree?\u0026rdquo; then answer \u0026ldquo;who determines the answer for the current node?\u0026rdquo; and finally write the relation as 1 + max(left, right) Once this recursive definition is built correctly, later problems such as balanced binary tree, tree diameter, path sums, and lowest common ancestor become much easier to enter.\nCore Concepts Depth / height: in this problem, the number of nodes on the longest path from root to the farthest leaf Postorder-style thinking: to know the answer for the current node, you must first know the answers of the left and right subtrees DFS: recurse downward and combine answers while backtracking BFS: traverse level by level; the last level number is the tree depth A - Algorithm (Problem and Algorithm) Problem Restatement 
Given the root node root of a binary tree, return its maximum depth.\nMaximum depth means the number of nodes along the longest path from the root down to the farthest leaf node.\nInput / Output Name Type Description root TreeNode root of the binary tree, may be null return int maximum depth of the tree Example 1 input: root = [3,9,20,null,null,15,7] output: 3 explanation: level 1: 3 level 2: 9, 20 level 3: 15, 7 so the maximum depth is 3. Example 2 input: root = [1,null,2] output: 2 Constraints The number of nodes is in the range [0, 10^4] -100 \u0026lt;= Node.val \u0026lt;= 100 C - Concepts (Core Ideas) Thought Process: Why the recursive formula is 1 + max(left, right) For any node node:\nif it is null, the depth is 0 if it is not null, then the maximum depth from that node is: 1 for the current level plus the deeper side between the left subtree and the right subtree So the state transition is direct:\ndepth(node) = 1 + max(depth(node.left), depth(node.right)) Method Category Tree recursion / DFS Level-order traversal / BFS Bottom-up answer combination in tree problems When DFS and BFS are each a good fit Recursive DFS\nshortest code matches the definition best ideal for most interviews and explanations Level-order BFS\nvery convenient for problems that are naturally layer-based if you also want the node distribution by level, BFS is more direct Why DFS is the recommended template here This problem does not ask you to print each level; it only asks for one final number.\nDFS writes the definition directly, gives the clearest expression, and has the lowest error rate.\nPractice Guide / Steps Recommended Approach: Recursive DFS If the node is null, return 0 Recursively compute the maximum depth of the left subtree Recursively compute the maximum depth of the right subtree Return 1 + max(leftDepth, rightDepth) Runnable Python example:\nclass TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def 
max_depth(root): if root is None: return 0 return 1 + max(max_depth(root.left), max_depth(root.right)) if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7))) print(max_depth(root)) BFS Alternative If you prefer level-order traversal, you can also:\nUse a queue to hold the current level Increase depth after processing one full level Stop when the queue becomes empty This method is also common, especially when the problem also asks for level-by-level output.\nE - Engineering (Real-world Scenarios) Scenario 1: Maximum nesting depth of frontend menu config (JavaScript) Background: backend systems often allow menus to be configured as trees.\nWhy it fits: before release, you can check whether the menu exceeds the maximum nesting level allowed by the design.\nconst menu = { name: \u0026#34;root\u0026#34;, children: [ { name: \u0026#34;dashboard\u0026#34;, children: [] }, { name: \u0026#34;settings\u0026#34;, children: [{ name: \u0026#34;profile\u0026#34;, children: [] }] }, ], }; function depth(node) { if (!node) return 0; if (!node.children || node.children.length === 0) return 1; return 1 + Math.max(...node.children.map(depth)); } console.log(depth(menu)); Scenario 2: Longest reporting chain in an org chart (Go) Background: org charts and approval flows are often represented as trees.\nWhy it fits: maximum depth measures hierarchy complexity and helps with workflow optimization and permission design.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { Name string Children []*Node } func depth(node *Node) int { if node == nil { return 0 } best := 0 for _, child := range node.Children { if d := depth(child); d \u0026gt; best { best = d } } return 1 + best } func main() { root := \u0026amp;Node{ Name: \u0026#34;CEO\u0026#34;, Children: []*Node{ { Name: \u0026#34;VP\u0026#34;, Children: []*Node{ {Name: \u0026#34;Manager\u0026#34;}, }, }, }, } fmt.Println(depth(root)) } Scenario 3: Maximum nesting 
validation for JSON (Python) Background: logs, configs, and ETL payloads often contain deeply nested JSON.\nWhy it fits: overly deep data hurts readability and downstream processing, so it is useful to enforce a depth limit at the input boundary.\ndef json_depth(x): if isinstance(x, dict): if not x: return 1 return 1 + max(json_depth(v) for v in x.values()) if isinstance(x, list): if not x: return 1 return 1 + max(json_depth(v) for v in x) return 1 data = {\u0026#34;a\u0026#34;: {\u0026#34;b\u0026#34;: {\u0026#34;c\u0026#34;: [1, {\u0026#34;d\u0026#34;: 2}]}}} print(json_depth(data)) R - Reflection (Analysis and Deeper Understanding) Complexity Analysis Time complexity: O(n), because each node is visited once Space complexity: DFS recursion: O(h) BFS queue: worst case O(n), or more precisely O(w) where w is the maximum tree width Alternative Approaches Method Time Extra Space Notes DFS recursion O(n) O(h) Matches the definition best and is recommended BFS level order O(n) O(w) Very convenient for level-based problems Explicit-stack DFS O(n) O(h) Useful if you do not want recursion Common Mistakes and Pitfalls Writing the depth of a null node as 1, which adds an extra level to the whole tree Mixing up \u0026ldquo;edge count\u0026rdquo; and \u0026ldquo;node count\u0026rdquo;; this problem counts nodes Recursing into only one side and forgetting max(left, right) Incrementing depth on every popped BFS node, which counts nodes instead of levels Common Questions and Notes 1. Is this preorder, inorder, or postorder? More accurately, it follows a postorder-style merge pattern, because the answer for the current node depends on the answers of its left and right subtrees.\n2. Which is better, DFS or BFS? If you only need one depth value, DFS is simpler. If you also want the nodes grouped by level, BFS is more natural.\n3. Can recursion overflow the stack? Yes, for extremely degenerate trees. 
In engineering situations where tree depth is unbounded, explicit stacks or BFS are safer.\nBest Practices and Suggestions Write the base case clearly first: what should the function return when node == null? For tree problems where the current answer depends on the left and right subtree answers, think of recursive return values first When writing complexity, distinguish O(h) from O(w) for a more accurate statement Being able to explain when DFS or BFS is appropriate matters more than simply memorizing code S - Summary The core of LeetCode 104 is not the code, but the definition of depth itself Once depth(node) = 1 + max(left, right) is clear, the recursive solution almost writes itself DFS is the most recommended template for this problem, while BFS is an excellent level-based alternative This problem is foundational for balanced-tree, diameter, and path-sum questions In engineering, any hierarchical structure with \u0026ldquo;maximum nesting depth\u0026rdquo; can reuse the same idea References and Further Reading LeetCode 104: Maximum Depth of Binary Tree LeetCode 111: Minimum Depth of Binary Tree LeetCode 110: Balanced Binary Tree LeetCode 543: Diameter of Binary Tree LeetCode 102: Binary Tree Level Order Traversal CTA It is worth practicing 104 together with 111.\nOne is about max, while the other often exposes tricky null-subtree handling; together they make your tree-recursion base cases much more stable.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def max_depth(root): if root is None: return 0 return 1 + max(max_depth(root.left), max_depth(root.right)) if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7))) print(max_depth(root)) #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct TreeNode { int val; struct TreeNode* left; 
struct TreeNode* right; }; struct TreeNode* new_node(int val) { struct TreeNode* node = (struct TreeNode*)malloc(sizeof(struct TreeNode)); node-\u0026gt;val = val; node-\u0026gt;left = NULL; node-\u0026gt;right = NULL; return node; } int maxDepth(struct TreeNode* root) { if (root == NULL) return 0; int left = maxDepth(root-\u0026gt;left); int right = maxDepth(root-\u0026gt;right); return 1 + (left \u0026gt; right ? left : right); } void free_tree(struct TreeNode* root) { if (!root) return; free_tree(root-\u0026gt;left); free_tree(root-\u0026gt;right); free(root); } int main(void) { struct TreeNode* root = new_node(3); root-\u0026gt;left = new_node(9); root-\u0026gt;right = new_node(20); root-\u0026gt;right-\u0026gt;left = new_node(15); root-\u0026gt;right-\u0026gt;right = new_node(7); printf(\u0026#34;%d\\n\u0026#34;, maxDepth(root)); free_tree(root); return 0; } #include \u0026lt;algorithm\u0026gt; #include \u0026lt;iostream\u0026gt; struct TreeNode { int val; TreeNode* left; TreeNode* right; explicit TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} }; int maxDepth(TreeNode* root) { if (!root) return 0; int left = maxDepth(root-\u0026gt;left); int right = maxDepth(root-\u0026gt;right); return 1 + std::max(left, right); } void freeTree(TreeNode* root) { if (!root) return; freeTree(root-\u0026gt;left); freeTree(root-\u0026gt;right); delete root; } int main() { TreeNode* root = new TreeNode(3); root-\u0026gt;left = new TreeNode(9); root-\u0026gt;right = new TreeNode(20); root-\u0026gt;right-\u0026gt;left = new TreeNode(15); root-\u0026gt;right-\u0026gt;right = new TreeNode(7); std::cout \u0026lt;\u0026lt; maxDepth(root) \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; freeTree(root); return 0; } package main import \u0026#34;fmt\u0026#34; type TreeNode struct { Val int Left *TreeNode Right *TreeNode } func maxDepth(root *TreeNode) int { if root == nil { return 0 } left := maxDepth(root.Left) right := maxDepth(root.Right) if left \u0026gt; right { return 1 + left 
} return 1 + right } func main() { root := \u0026amp;TreeNode{ Val: 3, Left: \u0026amp;TreeNode{Val: 9}, Right: \u0026amp;TreeNode{ Val: 20, Left: \u0026amp;TreeNode{Val: 15}, Right: \u0026amp;TreeNode{Val: 7}, }, } fmt.Println(maxDepth(root)) } #[derive(Debug)] struct TreeNode { val: i32, left: Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, right: Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, } fn max_depth(root: \u0026amp;Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;) -\u0026gt; i32 { match root { None =\u0026gt; 0, Some(node) =\u0026gt; 1 + max_depth(\u0026amp;node.left).max(max_depth(\u0026amp;node.right)), } } fn main() { let root = Some(Box::new(TreeNode { val: 3, left: Some(Box::new(TreeNode { val: 9, left: None, right: None, })), right: Some(Box::new(TreeNode { val: 20, left: Some(Box::new(TreeNode { val: 15, left: None, right: None, })), right: Some(Box::new(TreeNode { val: 7, left: None, right: None, })), })), })); println!(\u0026#34;{}\u0026#34;, max_depth(\u0026amp;root)); } function TreeNode(val, left = null, right = null) { this.val = val; this.left = left; this.right = right; } function maxDepth(root) { if (!root) return 0; return 1 + Math.max(maxDepth(root.left), maxDepth(root.right)); } const root = new TreeNode( 3, new TreeNode(9), new TreeNode(20, new TreeNode(15), new TreeNode(7)) ); console.log(maxDepth(root)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/binary-tree/104-maximum-depth-of-binary-tree/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\n\u0026ldquo;Maximum depth\u0026rdquo; is one of the cleanest starting points for tree recursion. Once you truly understand that the answer for the current tree depends on the answers from its left and right subtrees, a whole family of tree DP and DFS problems becomes easier. 
This guide uses LeetCode 104 to explain recursive DFS, level-order BFS, and the engineering value of the same pattern.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 9-11 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003ebinary tree\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e, \u003ccode\u003eBFS\u003c/code\u003e, \u003ccode\u003erecursion\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Hot100, Maximum Depth of Binary Tree, DFS, BFS, LeetCode 104\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Learn the DFS and BFS solutions for LeetCode 104 from the definition of depth, with engineering mappings and runnable multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLearners who are just starting tree problems and want to truly internalize \u0026ldquo;tree recursion return values\u0026rdquo;\u003c/li\u003e\n\u003cli\u003eDevelopers who can write traversals but get confused once the task becomes \u0026ldquo;compute height\u0026rdquo;, \u0026ldquo;compute path\u0026rdquo;, or \u0026ldquo;compute an answer\u0026rdquo;\u003c/li\u003e\n\u003cli\u003eEngineers who need depth analysis on hierarchical data such as menus, org charts, or nested JSON\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eLeetCode 104 looks like an easy problem, but it is almost the parent problem of tree recursion:\u003c/p\u003e","title":"Hot100: Maximum Depth of Binary Tree (DFS / BFS ACERS Guide)"},{"content":" Subtitle / Summary\nBinary tree traversal is the starting point of most tree templates, and inorder traversal is one of the cleanest problems for 
understanding both recursive thinking and explicit stack simulation. This ACERS guide uses LeetCode 94 to explain the left-root-right order, the iterative stack template, and why the pattern matters in real engineering work.\nReading time: 10-12 min Tags: Hot100, binary tree, DFS, stack, inorder traversal SEO keywords: Hot100, Binary Tree Inorder Traversal, inorder traversal, explicit stack, LeetCode 94 Meta description: A systematic guide to LeetCode 94 from recursion to explicit stacks, with engineering scenarios and runnable multi-language implementations. Target Readers Hot100 learners who want to lock in a stable tree-traversal template Developers moving from arrays and linked lists to trees, and still mixing up preorder, inorder, and postorder Engineers who want to reuse the left-root-right idea in BSTs, expression trees, or syntax trees Background / Motivation Inorder traversal is not hard by itself, but its training value is high:\nit is one of the easiest problems for building intuition that recursion = implicit stack, while iteration = explicit stack it helps you internalize the process of \u0026ldquo;go left all the way, backtrack to visit the root, then move into the right subtree\u0026rdquo; in a binary search tree (BST), inorder traversal naturally produces a sorted sequence, so the engineering value is very real When many people first solve tree problems, the issue is not the logic itself, but:\nnot being sure which node gets visited first not knowing exactly when to push and pop in the iterative version getting the code tangled when the tree is empty or degenerates into a one-sided chain If you master this template, later problems like validating a BST, finding the k-th smallest element, or recovering a BST become much smoother.\nCore Concepts Inorder traversal: visit in the order left subtree -\u0026gt; root node -\u0026gt; right subtree DFS (depth-first search): the most common organization pattern for tree traversal; inorder is one specific 
visitation order Explicit stack: manually simulate the recursion call stack by storing nodes you still need to come back to Tree height h: space complexity is usually written as O(h); for balanced trees this is about O(log n), and for a degenerate chain it becomes O(n) A - Algorithm (Problem and Algorithm) Problem Restatement Given the root node root of a binary tree, return its inorder traversal result.\nInput / Output Name Type Description root TreeNode root of the binary tree, may be null return int[] / List[int] node values in inorder sequence Example 1 input: root = [1,null,2,3] output: [1,3,2] explanation: 1 \\ 2 / 3 The inorder order is left -\u0026gt; root -\u0026gt; right, so the answer is [1,3,2]. Example 2 input: root = [] output: [] Example 3 input: root = [1] output: [1] Constraints The number of nodes is in the range [0, 100] -100 \u0026lt;= Node.val \u0026lt;= 100 C - Concepts (Core Ideas) Thought Process: From recursive definition to explicit stack template The most natural form is recursion\nFor each node node:\ntraverse the left subtree first visit the current node traverse the right subtree last That matches the definition of inorder exactly, so the code is very short.\nBut interviews often ask: can you do it without recursion?\nSince recursion relies on the function call stack, interviewers often want you to write that process out explicitly.\nWhy do we keep pushing nodes while going left?\nBecause inorder requires the left subtree to be processed first. 
So as long as the current node is not null, we push it and continue to left.\nOnce we hit null, the leftmost chain is exhausted, and the top of the stack is exactly the next root we should visit.\nMethod Category Tree DFS Recursive traversal Stack-based recursion simulation Explicit Stack Template The iterative version can be remembered in four stable steps:\ncur = root While cur != null, keep pushing and move left After the left side ends, pop the stack top and record its value Move cur to the right subtree of the popped node, then repeat Pseudo flow:\nwhile cur is not null or stack is not empty: while cur is not null: stack.push(cur) cur = cur.left cur = stack.pop() record cur.val cur = cur.right Why this order is always correct Each node is visited exactly once during left-chain backtracking The left subtree always finishes before the node itself The right subtree only starts after the root node has been visited That is exactly equivalent to the definition of inorder traversal, so the result is correct.\nPractice Guide / Steps Recommended Approach: Iterative explicit stack Prepare the result array res and a stack stack Start cur from the root Keep pushing the left chain Pop and visit the root Move into the right subtree Stop when both the stack is empty and cur is null Runnable Python example:\nclass TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def inorder_traversal(root): res = [] stack = [] cur = root while cur is not None or stack: while cur is not None: stack.append(cur) cur = cur.left cur = stack.pop() res.append(cur.val) cur = cur.right return res if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode(1, None, TreeNode(2, TreeNode(3), None)) print(inorder_traversal(root)) E - Engineering (Real-world Scenarios) Scenario 1: Export sorted primary keys from a BST (Python) Background: many in-memory indexes, cache dictionaries, and teaching-oriented search trees store data in BST form.\nWhy 
it fits: inorder traversal of a BST naturally yields ascending order, so it is useful for audit exports or debug snapshots.\nclass Node: def __init__(self, key, left=None, right=None): self.key = key self.left = left self.right = right def inorder(node, out): if node is None: return inorder(node.left, out) out.append(node.key) inorder(node.right, out) root = Node(5, Node(3, Node(2), Node(4)), Node(7)) result = [] inorder(root, result) print(result) Scenario 2: Convert an expression tree to infix notation (JavaScript) Background: compilers, formula editors, and rule engines often organize expressions as binary trees.\nWhy it fits: inorder traversal naturally matches the reading order of infix expressions, which makes the result more human-friendly.\nfunction Node(val, left = null, right = null) { this.val = val; this.left = left; this.right = right; } function inorder(node) { if (!node) return \u0026#34;\u0026#34;; if (!node.left \u0026amp;\u0026amp; !node.right) return String(node.val); return `(${inorder(node.left)} ${node.val} ${inorder(node.right)})`; } const tree = new Node(\u0026#34;*\u0026#34;, new Node(\u0026#34;+\u0026#34;, new Node(1), new Node(2)), new Node(3)); console.log(inorder(tree)); Scenario 3: Inspect local ordering in a tree-based config (Go) Background: some rule systems use \u0026ldquo;left branch / current node / right branch\u0026rdquo; as a stable manual inspection order.\nWhy it fits: inorder traversal lets developers inspect nodes in a fixed local order, which helps with diffs and manual verification.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { Name string Left *Node Right *Node } func inorder(node *Node, out *[]string) { if node == nil { return } inorder(node.Left, out) *out = append(*out, node.Name) inorder(node.Right, out) } func main() { root := \u0026amp;Node{\u0026#34;root\u0026#34;, \u0026amp;Node{\u0026#34;L\u0026#34;, nil, nil}, \u0026amp;Node{\u0026#34;R\u0026#34;, nil, nil}} order := []string{} inorder(root, 
\u0026amp;order) fmt.Println(order) } R - Reflection (Analysis and Deeper Understanding) Complexity Analysis Time complexity: O(n), because each node is processed exactly once Space complexity: Recursive version: O(h) call stack Explicit-stack version: O(h) auxiliary stack Alternative Approaches Method Time Extra Space Notes Recursion O(n) O(h) Most intuitive and shortest Explicit stack O(n) O(h) Most common interview template and highly reusable Morris traversal O(n) O(1) Temporarily modifies tree structure and is harder to reason about Common Mistakes and Pitfalls Mixing up the visitation points of preorder, inorder, and postorder Forgetting to move to cur.right after popping in the iterative version Writing only while cur != null and missing the case where nodes are still waiting in the stack Accessing node.left directly in recursion without a null check first Common Questions and Notes 1. Is inorder traversal always sorted? No. It is sorted only when the tree satisfies the BST property.\n2. Which is more recommended, recursion or iteration? In interviews, you should know both. Early on, recursion is the best way to build the definition in your head; after that, the explicit stack template is the most stable pattern to memorize.\n3. Is Morris traversal worth memorizing? It is worth understanding, but it should not be your first priority at the fundamentals stage. 
Get recursion and explicit stacks stable first.\nBest Practices and Suggestions Memorize the definition in one sentence: left, root, right For the iterative template, remember: \u0026ldquo;push left chain -\u0026gt; pop and visit -\u0026gt; move right\u0026rdquo; Whenever you see BST, think of \u0026ldquo;inorder = sorted\u0026rdquo; For tree problems, writing space as O(h) is often more accurate than writing only O(n) S - Summary The core of inorder traversal is the fixed visitation order: left -\u0026gt; root -\u0026gt; right Recursion matches the definition best, while the explicit stack version is the best interview template This problem trains two core abilities: tree recursion and manual simulation of the call stack In BSTs, expression trees, and configuration trees, inorder thinking has practical engineering value Once you can write 94 smoothly, BST validation and k-th-smallest problems become much easier References and Further Reading LeetCode 94: Binary Tree Inorder Traversal LeetCode 144: Binary Tree Preorder Traversal LeetCode 145: Binary Tree Postorder Traversal LeetCode 98: Validate Binary Search Tree LeetCode 230: Kth Smallest Element in a BST CTA First handwrite the recursive version, then rewrite the explicit-stack version without looking at the answer.\nIf you can reliably finish LeetCode 94 in about three minutes, your tree-traversal fundamentals are already in place.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) class TreeNode: def __init__(self, val=0, left=None, right=None): self.val = val self.left = left self.right = right def inorder_traversal(root): res = [] stack = [] cur = root while cur is not None or stack: while cur is not None: stack.append(cur) cur = cur.left cur = stack.pop() res.append(cur.val) cur = cur.right return res if __name__ == \u0026#34;__main__\u0026#34;: root = TreeNode(1, None, TreeNode(2, TreeNode(3), None)) print(inorder_traversal(root)) #include \u0026lt;stdio.h\u0026gt; #include 
\u0026lt;stdlib.h\u0026gt; struct TreeNode { int val; struct TreeNode* left; struct TreeNode* right; }; struct TreeNode* new_node(int val) { struct TreeNode* node = (struct TreeNode*)malloc(sizeof(struct TreeNode)); node-\u0026gt;val = val; node-\u0026gt;left = NULL; node-\u0026gt;right = NULL; return node; } int* inorderTraversal(struct TreeNode* root, int* returnSize) { struct TreeNode* stack[128]; int top = 0; int* res = (int*)malloc(sizeof(int) * 128); *returnSize = 0; struct TreeNode* cur = root; while (cur != NULL || top \u0026gt; 0) { while (cur != NULL) { stack[top++] = cur; cur = cur-\u0026gt;left; } cur = stack[--top]; res[(*returnSize)++] = cur-\u0026gt;val; cur = cur-\u0026gt;right; } return res; } void free_tree(struct TreeNode* root) { if (!root) return; free_tree(root-\u0026gt;left); free_tree(root-\u0026gt;right); free(root); } int main(void) { struct TreeNode* root = new_node(1); root-\u0026gt;right = new_node(2); root-\u0026gt;right-\u0026gt;left = new_node(3); int n = 0; int* ans = inorderTraversal(root, \u0026amp;n); for (int i = 0; i \u0026lt; n; ++i) { printf(\u0026#34;%d%s\u0026#34;, ans[i], i + 1 == n ? 
\u0026#34;\\n\u0026#34; : \u0026#34; \u0026#34;); } free(ans); free_tree(root); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;stack\u0026gt; #include \u0026lt;vector\u0026gt; struct TreeNode { int val; TreeNode* left; TreeNode* right; explicit TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} }; std::vector\u0026lt;int\u0026gt; inorderTraversal(TreeNode* root) { std::vector\u0026lt;int\u0026gt; res; std::stack\u0026lt;TreeNode*\u0026gt; st; TreeNode* cur = root; while (cur || !st.empty()) { while (cur) { st.push(cur); cur = cur-\u0026gt;left; } cur = st.top(); st.pop(); res.push_back(cur-\u0026gt;val); cur = cur-\u0026gt;right; } return res; } void freeTree(TreeNode* root) { if (!root) return; freeTree(root-\u0026gt;left); freeTree(root-\u0026gt;right); delete root; } int main() { TreeNode* root = new TreeNode(1); root-\u0026gt;right = new TreeNode(2); root-\u0026gt;right-\u0026gt;left = new TreeNode(3); auto ans = inorderTraversal(root); for (size_t i = 0; i \u0026lt; ans.size(); ++i) { std::cout \u0026lt;\u0026lt; ans[i] \u0026lt;\u0026lt; (i + 1 == ans.size() ? 
\u0026#39;\\n\u0026#39; : \u0026#39; \u0026#39;); } freeTree(root); return 0; } package main import \u0026#34;fmt\u0026#34; type TreeNode struct { Val int Left *TreeNode Right *TreeNode } func inorderTraversal(root *TreeNode) []int { res := []int{} stack := []*TreeNode{} cur := root for cur != nil || len(stack) \u0026gt; 0 { for cur != nil { stack = append(stack, cur) cur = cur.Left } cur = stack[len(stack)-1] stack = stack[:len(stack)-1] res = append(res, cur.Val) cur = cur.Right } return res } func main() { root := \u0026amp;TreeNode{Val: 1} root.Right = \u0026amp;TreeNode{Val: 2, Left: \u0026amp;TreeNode{Val: 3}} fmt.Println(inorderTraversal(root)) } #[derive(Debug)] struct TreeNode { val: i32, left: Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, right: Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, } fn inorder_traversal(root: \u0026amp;Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;) -\u0026gt; Vec\u0026lt;i32\u0026gt; { fn dfs(node: \u0026amp;Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, res: \u0026amp;mut Vec\u0026lt;i32\u0026gt;) { if let Some(node) = node { dfs(\u0026amp;node.left, res); res.push(node.val); dfs(\u0026amp;node.right, res); } } let mut res = vec![]; dfs(root, \u0026amp;mut res); res } fn main() { let root = Some(Box::new(TreeNode { val: 1, left: None, right: Some(Box::new(TreeNode { val: 2, left: Some(Box::new(TreeNode { val: 3, left: None, right: None, })), right: None, })), })); println!(\u0026#34;{:?}\u0026#34;, inorder_traversal(\u0026amp;root)); } function TreeNode(val, left = null, right = null) { this.val = val; this.left = left; this.right = right; } function inorderTraversal(root) { const res = []; const stack = []; let cur = root; while (cur || stack.length) { while (cur) { stack.push(cur); cur = cur.left; } cur = stack.pop(); res.push(cur.val); cur = cur.right; } return res; } const root = new TreeNode(1, null, new TreeNode(2, new TreeNode(3), null)); console.log(inorderTraversal(root)); 
","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/binary-tree/94-binary-tree-inorder-traversal/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nBinary tree traversal is the starting point of most tree templates, and inorder traversal is one of the cleanest problems for understanding both recursive thinking and explicit stack simulation. This ACERS guide uses LeetCode 94 to explain the left-root-right order, the iterative stack template, and why the pattern matters in real engineering work.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003ebinary tree\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e, \u003ccode\u003estack\u003c/code\u003e, \u003ccode\u003einorder traversal\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Hot100, Binary Tree Inorder Traversal, inorder traversal, explicit stack, LeetCode 94\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A systematic guide to LeetCode 94 from recursion to explicit stacks, with engineering scenarios and runnable multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want to lock in a stable tree-traversal template\u003c/li\u003e\n\u003cli\u003eDevelopers moving from arrays and linked lists to trees, and still mixing up preorder, inorder, and postorder\u003c/li\u003e\n\u003cli\u003eEngineers who want to reuse the left-root-right idea in BSTs, expression trees, or syntax trees\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / 
Motivation\u003c/h2\u003e\n\u003cp\u003eInorder traversal is not hard by itself, but its training value is high:\u003c/p\u003e","title":"Hot100: Binary Tree Inorder Traversal (Recursion / Stack ACERS Guide)"},{"content":" Subtitle / Summary\nThis is not a memorization question. It is core cache-engineering practice: satisfy fast lookup and least-recently-used eviction at the same time, both in constant average time. We derive the optimal structure from naive approaches and provide runnable implementations.\nReading time: 14-18 min Tags: LRU, hash map, doubly linked list, system design SEO keywords: LRU Cache, LeetCode 146, hash map, doubly linked list, O(1) Meta description: Build an LRU cache with hash map + doubly linked list to achieve O(1) average get/put, with engineering use cases, pitfalls, and six-language implementations. Target Readers LeetCode learners who want to master data-structure composition Backend/middleware engineers implementing local caches Interview candidates who know the answer headline but not the invariants Background / Motivation Caching trades space for time, but cache space is limited.\nWhen full, we must evict keys. 
LRU (Least Recently Used) assumes:\nRecently accessed data is more likely to be accessed again Long-idle data is a better eviction candidate Real-world examples:\nAPI response caching Database hot-record caching Local page/session state caching Core Concepts LRU policy: evict the least recently used key Access refresh: successful get must make key most recently used Capacity constraint: put on new key may trigger immediate eviction O(1) average complexity: neither get nor put can do linear scans A — Algorithm (Problem \u0026amp; Algorithm) Problem Restatement Design and implement LRUCache:\nLRUCache(int capacity): initialize with positive capacity int get(int key): return value if key exists, else -1 void put(int key, int value): key exists: update value and mark it as most recently used key not exists: insert key-value over capacity: evict least recently used key Both operations must run in average O(1) time.\nExample 1 (Operation Sequence) LRUCache cache = new LRUCache(2) cache.put(1, 1) // cache: {1=1} cache.put(2, 2) // cache: {1=1, 2=2} cache.get(1) // return 1, and 1 becomes most recent cache.put(3, 3) // capacity full, evict key=2 cache.get(2) // return -1 cache.put(4, 4) // evict key=1 cache.get(1) // return -1 cache.get(3) // return 3 cache.get(4) // return 4 Example 2 (Update Existing Key) LRUCache cache = new LRUCache(2) cache.put(1, 10) cache.put(1, 99) // update value, key=1 is refreshed as most recent cache.get(1) // return 99 Derivation: from Naive to Optimal Naive approach 1: array tracks recency order get: hash lookup O(1), but moving key to newest in array is O(n) put: eviction can be O(1), but recency updates are often O(n) Conclusion: cannot guarantee O(1) for both operations.\nNaive approach 2: linked list only O(1) insert/delete at ends key lookup still O(n) Conclusion: lookup too slow.\nKey Observation We need both:\nFast key-to-node location -\u0026gt; hash map Fast recency reordering -\u0026gt; doubly linked list Method Choice (Optimal) 
Hash map: key -\u0026gt; node pointer/iterator Doubly linked list: head side = most recently used (MRU) tail side = least recently used (LRU) Operations:\nget(key): if hit, move node to front put(key,value): exists: update and move to front not exists: if full, remove tail node; then insert at front C — Concepts (Core Ideas) Data Structure Model HashMap: key -\u0026gt; Node* DoubleList: head \u0026lt;-\u0026gt; n1 \u0026lt;-\u0026gt; n2 \u0026lt;-\u0026gt; ... \u0026lt;-\u0026gt; nk \u0026lt;-\u0026gt; tail ^ MRU LRU ^ Invariants List order is always newest -\u0026gt; oldest Every key in map points to exactly one list node list_size == map_size Atomic Operations remove(node): unlink any known node in O(1) add_front(node): insert at front in O(1) move_to_front(node): remove + add_front in O(1) pop_back(): remove least recent node in O(1) Practice Guide / Steps Define doubly-linked node with key, value, prev, next Create head/tail sentinels to avoid edge-case branches Store key -\u0026gt; node in map On get hit, move node to front On new put, evict from back if full, then insert front Always insert/refresh at front to represent recent usage Minimal runnable Python example:\nclass Node: def __init__(self, key=0, val=0): self.key = key self.val = val self.prev = None self.next = None class LRUCache: def __init__(self, capacity: int): self.cap = capacity self.map = {} self.head = Node() # MRU side sentinel self.tail = Node() # LRU side sentinel self.head.next = self.tail self.tail.prev = self.head def _remove(self, node: Node) -\u0026gt; None: p, n = node.prev, node.next p.next = n n.prev = p def _add_front(self, node: Node) -\u0026gt; None: node.prev = self.head node.next = self.head.next self.head.next.prev = node self.head.next = node def _move_front(self, node: Node) -\u0026gt; None: self._remove(node) self._add_front(node) def _pop_lru(self) -\u0026gt; Node: node = self.tail.prev self._remove(node) return node def get(self, key: int) -\u0026gt; int: node = 
self.map.get(key) if node is None: return -1 self._move_front(node) return node.val def put(self, key: int, value: int) -\u0026gt; None: if self.cap == 0: return node = self.map.get(key) if node is not None: node.val = value self._move_front(node) return if len(self.map) == self.cap: old = self._pop_lru() del self.map[old.key] node = Node(key, value) self.map[key] = node self._add_front(node) if __name__ == \u0026#34;__main__\u0026#34;: c = LRUCache(2) c.put(1, 1) c.put(2, 2) print(c.get(1)) # 1 c.put(3, 3) print(c.get(2)) # -1 E — Engineering (Real-world Scenarios) Scenario 1: short-term API response cache (Python) Background: hot API endpoints receive repeated requests with same parameters.\nWhy it fits: recently accessed keys are likely to be reused, while capacity stays bounded.\nimport time cache = {} def fetch_user_profile(uid: int) -\u0026gt; dict: if uid in cache: return cache[uid] # Simulate slow query time.sleep(0.02) cache[uid] = {\u0026#34;uid\u0026#34;: uid, \u0026#34;name\u0026#34;: f\u0026#34;user-{uid}\u0026#34;} return cache[uid] print(fetch_user_profile(7)) Scenario 2: config-center local cache in services (Go) Background: microservices read config frequently; remote fetch has network overhead.\nWhy it fits: recently used config keys are more likely to be accessed again.\npackage main import \u0026#34;fmt\u0026#34; func main() { // In production, LRU can be a layer in your config client. 
fmt.Println(\u0026#34;config cache ready with LRU policy\u0026#34;) } Scenario 3: frontend page-data cache (JavaScript) Background: SPA route switching benefits from reusing recently visited data.\nWhy it fits: recently viewed pages have higher revisit probability.\nconst pageState = new Map(); pageState.set(\u0026#34;feed?page=1\u0026#34;, { items: [1, 2, 3] }); console.log(pageState.get(\u0026#34;feed?page=1\u0026#34;)); R — Reflection (Deep Dive) Complexity get: hash lookup + list move, average O(1) put: hash lookup/insert + list insert/delete, average O(1) Space: O(capacity) Alternative Comparison Approach get put Problem Hash map + timestamp only O(1) eviction often needs O(n) scan slow eviction Linked list only O(n) O(1) slow lookup Hash map + doubly linked list O(1) O(1) slightly more implementation detail Common Mistakes Hit in get but forget to refresh recency (not moving to front) Existing key in put updates value but not recency Evict from list but forget to remove map entry (stale pointer) Forgetting the capacity == 0 edge case Why this method is production-friendly Stable performance and predictable constant-time behavior Easy extensibility: TTL, hit-rate metrics, lock wrappers Clear invariants and atomic operations for testing/debugging FAQ Q1: Why doubly linked list, not singly linked list? Removing an arbitrary node in singly linked list needs predecessor lookup, usually O(n). Doubly linked list deletes known nodes in O(1).\nQ2: Why store key inside each list node? When evicting tail node, we must remove its key from map in O(1). Without storing key in node, that step becomes expensive.\nQ3: How is this different from LFU? LRU evicts by recency; LFU evicts by frequency. 
LFU requires more complex structures and update logic.\nBest Practices Always use head/tail sentinels to avoid fragile boundary branches Keep atomic list ops private: remove/add_front/move/pop_back Test operation sequences, not only final state snapshots Lock correctness first, then optimize concurrency granularity S — Summary Key takeaways:\nLRU is recency ordering + fixed-capacity eviction. O(1) requires data-structure composition, not a single structure. Stability comes from invariants: order consistency, map consistency, capacity consistency. This model maps directly to practical caches in backend and frontend systems. It is a strong base for TTL-LRU, concurrent LRU, and LFU extensions. Recommended follow-ups:\nLeetCode 460 LFU Cache Redis eviction policy docs (allkeys-lru, volatile-lru) Designing Data-Intensive Applications caching chapters System design materials on local cache consistency strategies Runnable Multi-language Implementations Python class Node: def __init__(self, key=0, val=0): self.key = key self.val = val self.prev = None self.next = None class LRUCache: def __init__(self, capacity: int): self.cap = capacity self.map = {} self.head = Node() self.tail = Node() self.head.next = self.tail self.tail.prev = self.head def _remove(self, node: Node) -\u0026gt; None: p, n = node.prev, node.next p.next = n n.prev = p def _add_front(self, node: Node) -\u0026gt; None: node.prev = self.head node.next = self.head.next self.head.next.prev = node self.head.next = node def _move_front(self, node: Node) -\u0026gt; None: self._remove(node) self._add_front(node) def _pop_lru(self) -\u0026gt; Node: node = self.tail.prev self._remove(node) return node def get(self, key: int) -\u0026gt; int: node = self.map.get(key) if node is None: return -1 self._move_front(node) return node.val def put(self, key: int, value: int) -\u0026gt; None: if self.cap == 0: return node = self.map.get(key) if node: node.val = value self._move_front(node) return if len(self.map) == self.cap: 
old = self._pop_lru() del self.map[old.key] node = Node(key, value) self.map[key] = node self._add_front(node) if __name__ == \u0026#34;__main__\u0026#34;: c = LRUCache(2) c.put(1, 1) c.put(2, 2) print(c.get(1)) # 1 c.put(3, 3) print(c.get(2)) # -1 c.put(4, 4) print(c.get(1), c.get(3), c.get(4)) # -1 3 4 C #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; #define HASH_SIZE 4093 typedef struct Node { int key; int val; struct Node* prev; struct Node* next; } Node; typedef struct Entry { int key; Node* node; struct Entry* next; } Entry; typedef struct { Entry* buckets[HASH_SIZE]; } HashMap; typedef struct { int cap; int size; HashMap map; Node head; Node tail; } LRUCache; unsigned int h(int key) { unsigned int x = (unsigned int)key; return (x * 2654435761u) % HASH_SIZE; } Node* map_get(HashMap* m, int key) { unsigned int idx = h(key); Entry* e = m-\u0026gt;buckets[idx]; while (e) { if (e-\u0026gt;key == key) return e-\u0026gt;node; e = e-\u0026gt;next; } return NULL; } void map_put(HashMap* m, int key, Node* node) { unsigned int idx = h(key); Entry* e = m-\u0026gt;buckets[idx]; while (e) { if (e-\u0026gt;key == key) { e-\u0026gt;node = node; return; } e = e-\u0026gt;next; } Entry* ne = (Entry*)malloc(sizeof(Entry)); ne-\u0026gt;key = key; ne-\u0026gt;node = node; ne-\u0026gt;next = m-\u0026gt;buckets[idx]; m-\u0026gt;buckets[idx] = ne; } void map_remove(HashMap* m, int key) { unsigned int idx = h(key); Entry* cur = m-\u0026gt;buckets[idx]; Entry* pre = NULL; while (cur) { if (cur-\u0026gt;key == key) { if (pre) pre-\u0026gt;next = cur-\u0026gt;next; else m-\u0026gt;buckets[idx] = cur-\u0026gt;next; free(cur); return; } pre = cur; cur = cur-\u0026gt;next; } } void list_remove(Node* n) { n-\u0026gt;prev-\u0026gt;next = n-\u0026gt;next; n-\u0026gt;next-\u0026gt;prev = n-\u0026gt;prev; } void list_add_front(LRUCache* c, Node* n) { n-\u0026gt;prev = \u0026amp;c-\u0026gt;head; n-\u0026gt;next = c-\u0026gt;head.next; c-\u0026gt;head.next-\u0026gt;prev = 
n; c-\u0026gt;head.next = n; } void move_front(LRUCache* c, Node* n) { list_remove(n); list_add_front(c, n); } Node* pop_lru(LRUCache* c) { Node* n = c-\u0026gt;tail.prev; list_remove(n); return n; } LRUCache* lruCreate(int capacity) { LRUCache* c = (LRUCache*)calloc(1, sizeof(LRUCache)); c-\u0026gt;cap = capacity; c-\u0026gt;size = 0; c-\u0026gt;head.next = \u0026amp;c-\u0026gt;tail; c-\u0026gt;tail.prev = \u0026amp;c-\u0026gt;head; return c; } int lruGet(LRUCache* c, int key) { Node* n = map_get(\u0026amp;c-\u0026gt;map, key); if (!n) return -1; move_front(c, n); return n-\u0026gt;val; } void lruPut(LRUCache* c, int key, int value) { if (c-\u0026gt;cap == 0) return; Node* n = map_get(\u0026amp;c-\u0026gt;map, key); if (n) { n-\u0026gt;val = value; move_front(c, n); return; } if (c-\u0026gt;size == c-\u0026gt;cap) { Node* old = pop_lru(c); map_remove(\u0026amp;c-\u0026gt;map, old-\u0026gt;key); free(old); c-\u0026gt;size--; } Node* nn = (Node*)malloc(sizeof(Node)); nn-\u0026gt;key = key; nn-\u0026gt;val = value; list_add_front(c, nn); map_put(\u0026amp;c-\u0026gt;map, key, nn); c-\u0026gt;size++; } void lruFree(LRUCache* c) { Node* cur = c-\u0026gt;head.next; while (cur != \u0026amp;c-\u0026gt;tail) { Node* nxt = cur-\u0026gt;next; free(cur); cur = nxt; } for (int i = 0; i \u0026lt; HASH_SIZE; i++) { Entry* e = c-\u0026gt;map.buckets[i]; while (e) { Entry* ne = e-\u0026gt;next; free(e); e = ne; } } free(c); } int main(void) { LRUCache* c = lruCreate(2); lruPut(c, 1, 1); lruPut(c, 2, 2); printf(\u0026#34;%d\\n\u0026#34;, lruGet(c, 1)); // 1 lruPut(c, 3, 3); printf(\u0026#34;%d\\n\u0026#34;, lruGet(c, 2)); // -1 lruPut(c, 4, 4); printf(\u0026#34;%d %d %d\\n\u0026#34;, lruGet(c, 1), lruGet(c, 3), lruGet(c, 4)); // -1 3 4 lruFree(c); return 0; } C++ #include \u0026lt;iostream\u0026gt; #include \u0026lt;list\u0026gt; #include \u0026lt;unordered_map\u0026gt; using namespace std; class LRUCache { private: int cap; list\u0026lt;pair\u0026lt;int, int\u0026gt;\u0026gt; dq; 
// front = MRU, back = LRU unordered_map\u0026lt;int, list\u0026lt;pair\u0026lt;int, int\u0026gt;\u0026gt;::iterator\u0026gt; pos; public: explicit LRUCache(int capacity) : cap(capacity) {} int get(int key) { auto it = pos.find(key); if (it == pos.end()) return -1; dq.splice(dq.begin(), dq, it-\u0026gt;second); return it-\u0026gt;second-\u0026gt;second; } void put(int key, int value) { if (cap == 0) return; auto it = pos.find(key); if (it != pos.end()) { it-\u0026gt;second-\u0026gt;second = value; dq.splice(dq.begin(), dq, it-\u0026gt;second); return; } if ((int)dq.size() == cap) { int oldKey = dq.back().first; pos.erase(oldKey); dq.pop_back(); } dq.push_front({key, value}); pos[key] = dq.begin(); } }; int main() { LRUCache c(2); c.put(1, 1); c.put(2, 2); cout \u0026lt;\u0026lt; c.get(1) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; // 1 c.put(3, 3); cout \u0026lt;\u0026lt; c.get(2) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; // -1 c.put(4, 4); cout \u0026lt;\u0026lt; c.get(1) \u0026lt;\u0026lt; \u0026#34; \u0026#34; \u0026lt;\u0026lt; c.get(3) \u0026lt;\u0026lt; \u0026#34; \u0026#34; \u0026lt;\u0026lt; c.get(4) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; // -1 3 4 return 0; } Go package main import ( \u0026#34;container/list\u0026#34; \u0026#34;fmt\u0026#34; ) type entry struct { key int value int } type LRUCache struct { cap int ll *list.List pos map[int]*list.Element } func Constructor(capacity int) LRUCache { return LRUCache{ cap: capacity, ll: list.New(), pos: make(map[int]*list.Element), } } func (c *LRUCache) Get(key int) int { e, ok := c.pos[key] if !ok { return -1 } c.ll.MoveToFront(e) return e.Value.(entry).value } func (c *LRUCache) Put(key int, value int) { if c.cap == 0 { return } if e, ok := c.pos[key]; ok { e.Value = entry{key: key, value: value} c.ll.MoveToFront(e) return } if c.ll.Len() == c.cap { back := c.ll.Back() old := back.Value.(entry) delete(c.pos, old.key) c.ll.Remove(back) } e := c.ll.PushFront(entry{key: key, value: value}) c.pos[key] = e 
} func main() { c := Constructor(2) c.Put(1, 1) c.Put(2, 2) fmt.Println(c.Get(1)) // 1 c.Put(3, 3) fmt.Println(c.Get(2)) // -1 c.Put(4, 4) fmt.Println(c.Get(1), c.Get(3), c.Get(4)) // -1 3 4 } Rust use std::collections::HashMap; #[derive(Clone, Debug)] struct Node { key: i32, val: i32, prev: usize, next: usize, } struct LRUCache { cap: usize, len: usize, map: HashMap\u0026lt;i32, usize\u0026gt;, // key -\u0026gt; node index nodes: Vec\u0026lt;Node\u0026gt;, free: Vec\u0026lt;usize\u0026gt;, head: usize, // sentinel tail: usize, // sentinel } impl LRUCache { fn new(capacity: i32) -\u0026gt; Self { let head = 0usize; let tail = 1usize; let nodes = vec![ Node { key: 0, val: 0, prev: head, next: tail, }, Node { key: 0, val: 0, prev: head, next: tail, }, ]; let mut c = Self { cap: capacity.max(0) as usize, len: 0, map: HashMap::new(), nodes, free: Vec::new(), head, tail, }; c.nodes[c.head].next = c.tail; c.nodes[c.tail].prev = c.head; c } fn detach(\u0026amp;mut self, idx: usize) { let p = self.nodes[idx].prev; let n = self.nodes[idx].next; self.nodes[p].next = n; self.nodes[n].prev = p; } fn insert_front(\u0026amp;mut self, idx: usize) { let first = self.nodes[self.head].next; self.nodes[idx].prev = self.head; self.nodes[idx].next = first; self.nodes[self.head].next = idx; self.nodes[first].prev = idx; } fn move_front(\u0026amp;mut self, idx: usize) { self.detach(idx); self.insert_front(idx); } fn pop_lru(\u0026amp;mut self) -\u0026gt; Option\u0026lt;usize\u0026gt; { let idx = self.nodes[self.tail].prev; if idx == self.head { return None; } self.detach(idx); Some(idx) } fn alloc_node(\u0026amp;mut self, key: i32, val: i32) -\u0026gt; usize { if let Some(idx) = self.free.pop() { self.nodes[idx] = Node { key, val, prev: self.head, next: self.tail, }; idx } else { self.nodes.push(Node { key, val, prev: self.head, next: self.tail, }); self.nodes.len() - 1 } } fn get(\u0026amp;mut self, key: i32) -\u0026gt; i32 { let idx = match self.map.get(\u0026amp;key) { 
Some(\u0026amp;i) =\u0026gt; i, None =\u0026gt; return -1, }; self.move_front(idx); self.nodes[idx].val } fn put(\u0026amp;mut self, key: i32, value: i32) { if self.cap == 0 { return; } if let Some(\u0026amp;idx) = self.map.get(\u0026amp;key) { self.nodes[idx].val = value; self.move_front(idx); return; } if self.len == self.cap { if let Some(old_idx) = self.pop_lru() { let old_key = self.nodes[old_idx].key; self.map.remove(\u0026amp;old_key); self.free.push(old_idx); self.len -= 1; } } let idx = self.alloc_node(key, value); self.insert_front(idx); self.map.insert(key, idx); self.len += 1; } } fn main() { let mut c = LRUCache::new(2); c.put(1, 1); c.put(2, 2); println!(\u0026#34;{}\u0026#34;, c.get(1)); // 1 c.put(3, 3); println!(\u0026#34;{}\u0026#34;, c.get(2)); // -1 c.put(4, 4); println!(\u0026#34;{} {} {}\u0026#34;, c.get(1), c.get(3), c.get(4)); // -1 3 4 } JavaScript class Node { constructor(key = 0, value = 0) { this.key = key; this.value = value; this.prev = null; this.next = null; } } class LRUCache { constructor(capacity) { this.cap = capacity; this.map = new Map(); this.head = new Node(); this.tail = new Node(); this.head.next = this.tail; this.tail.prev = this.head; } _remove(node) { node.prev.next = node.next; node.next.prev = node.prev; } _addFront(node) { node.prev = this.head; node.next = this.head.next; this.head.next.prev = node; this.head.next = node; } _moveFront(node) { this._remove(node); this._addFront(node); } _popLRU() { const node = this.tail.prev; this._remove(node); return node; } get(key) { if (!this.map.has(key)) return -1; const node = this.map.get(key); this._moveFront(node); return node.value; } put(key, value) { if (this.cap === 0) return; if (this.map.has(key)) { const node = this.map.get(key); node.value = value; this._moveFront(node); return; } if (this.map.size === this.cap) { const old = this._popLRU(); this.map.delete(old.key); } const node = new Node(key, value); this.map.set(key, node); this._addFront(node); } } const c = 
new LRUCache(2); c.put(1, 1); c.put(2, 2); console.log(c.get(1)); // 1 c.put(3, 3); console.log(c.get(2)); // -1 c.put(4, 4); console.log(c.get(1), c.get(3), c.get(4)); // -1 3 4 CTA Run these three drills now:\nRe-implement remove / add_front / pop_back without looking at the answer. Stress-test operation sequences: repeated put on same key, capacity 1, capacity 0. Solve LeetCode 460 (LFU) and compare structure complexity with LRU. ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/146-lru-cache/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThis is not a memorization question. It is core cache-engineering practice: satisfy fast lookup and least-recently-used eviction at the same time, both in constant average time. We derive the optimal structure from naive approaches and provide runnable implementations.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 14-18 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eLRU\u003c/code\u003e, \u003ccode\u003ehash map\u003c/code\u003e, \u003ccode\u003edoubly linked list\u003c/code\u003e, \u003ccode\u003esystem design\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: LRU Cache, LeetCode 146, hash map, doubly linked list, O(1)\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Build an LRU cache with hash map + doubly linked list to achieve O(1) average \u003ccode\u003eget/put\u003c/code\u003e, with engineering use cases, pitfalls, and six-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLeetCode learners who want to master data-structure composition\u003c/li\u003e\n\u003cli\u003eBackend/middleware engineers implementing 
local caches\u003c/li\u003e\n\u003cli\u003eInterview candidates who know the answer headline but not the invariants\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eCaching trades space for time, but cache space is limited.\u003cbr\u003e\nWhen full, we must evict keys. LRU (Least Recently Used) assumes:\u003c/p\u003e","title":"LeetCode 146: LRU Cache Design with O(1) Hash Map + Doubly Linked List"},{"content":" Subtitle / Summary\nThe hard part is not deletion itself, but locating the predecessor of the nth node from the end in a singly linked list. This article derives the one-pass two-pointer solution from simpler baselines and explains correctness, boundaries, and engineering transfer.\nReading time: 12-15 min Tags: linked list, two pointers, interview high frequency SEO keywords: LeetCode 19, Remove Nth Node From End of List, linked list, fast/slow pointers, dummy node Meta description: A complete ACERS walkthrough for removing the nth node from the end: from brute force to one-pass two pointers, with complexity, pitfalls, engineering scenarios, and Python/C/C++/Go/Rust/JS implementations. Target Readers Beginners building a stable template for linked-list interview problems Developers who know fast/slow pointers but still make boundary mistakes Backend/system engineers who want to transfer problem-solving templates to chain-structured data in production Background / Motivation \u0026ldquo;Remove the nth node from the end\u0026rdquo; is a classic medium-level linked-list problem. The challenge is usually not the delete operation itself, but:\nSingly linked lists cannot traverse backward from tail; Deleting the head node complicates return handling; Incorrect next rewiring can easily break the list. 
Once you master this problem, you get a reusable pattern: dummy node + fixed pointer gap, which is useful in many list operations (split, reverse by group, merge variants).\nCore Concepts Singly linked list: each node has only next, so traversal is one-directional. Dummy node: add a virtual node before head to unify head deletion and middle deletion. Fast/slow fixed gap: move fast ahead by n steps first; when fast reaches the end, slow lands at the predecessor of the target node. A - Algorithm (Problem \u0026amp; Algorithm) Problem Restatement Given the head of a linked list, remove the nth node from the end of the list and return its head.\nInput / Output Item Type Meaning head ListNode head of a singly linked list n int nth position from the end return ListNode head after deletion Example 1 Input: head = [1,2,3,4,5], n = 2 Output: [1,2,3,5] Explanation: the 2nd node from the end is 4, so remove it.\nExample 2 Input: head = [1], n = 1 Output: [] Explanation: removing the only node leaves an empty list.\nExample 3 Input: head = [1,2], n = 2 Output: [2] Explanation: the 2nd node from the end is the head node 1.\nPointer-gap diagram dummy -\u0026gt; 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5 -\u0026gt; null After moving fast by n=2: dummy -\u0026gt; 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5 -\u0026gt; null slow fast Move both together until fast reaches the tail: dummy -\u0026gt; 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5 -\u0026gt; null slow fast Now slow.next is the target node (4) C - Concepts (Core Ideas) Derivation: from naive to optimal Naive array conversion\nConvert to array, remove by index, then rebuild list.\nWorks, but uses O(L) extra space. Avoids linked-list strengths instead of using them. Two-pass traversal\nFirst pass gets length L; second pass stops at index L - n - 1.\nTime O(L), space O(1). Still needs two scans; head deletion handling is awkward without dummy. 
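The two-pass baseline just described can be sketched directly; this is a minimal illustrative sketch (the function name remove_nth_two_pass is ours, not LeetCode's), shown with a dummy node so that head deletion needs no special branch:

```python
from typing import Optional


class ListNode:
    def __init__(self, val: int = 0, next: Optional["ListNode"] = None):
        self.val = val
        self.next = next


def remove_nth_two_pass(head: Optional[ListNode], n: int) -> Optional[ListNode]:
    # Pass 1: measure the length L.
    length = 0
    cur = head
    while cur is not None:
        length += 1
        cur = cur.next
    # Pass 2: walk to the predecessor of the target (L - n steps from dummy);
    # the dummy node unifies head deletion with middle deletion.
    dummy = ListNode(0, head)
    prev = dummy
    for _ in range(length - n):
        prev = prev.next
    prev.next = prev.next.next
    return dummy.next
```

Both passes are O(L) and no extra structures are used, but the list is still scanned twice, which is what the one-pass variant below removes.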
Best approach: one-pass two pointers + dummy\nMove fast forward n steps first. Move fast and slow together until fast.next == null. slow.next is exactly the node to remove. Method category Two pointers Gap maintenance In-place pointer rewiring Correctness intuition Let list length be L. If fast and slow maintain a fixed gap of n nodes:\nWhen fast is at index L - 1 (tail), slow is at index L - n - 1 (predecessor of target). So removing slow.next is exactly removing the nth node from the end.\nPractice Guide / Steps Create dummy and point dummy.next = head. Initialize fast = slow = dummy. Move fast ahead by n steps. While fast.next != null, move both pointers forward. Delete node by rewiring: slow.next = slow.next.next. Return dummy.next. Runnable Python example:\nfrom typing import List, Optional class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def remove_nth_from_end(head: Optional[ListNode], n: int) -\u0026gt; Optional[ListNode]: dummy = ListNode(0, head) fast = slow = dummy for _ in range(n): fast = fast.next while fast.next is not None: fast = fast.next slow = slow.next slow.next = slow.next.next return dummy.next def from_list(nums: List[int]) -\u0026gt; Optional[ListNode]: dummy = ListNode() tail = dummy for x in nums: tail.next = ListNode(x) tail = tail.next return dummy.next def to_list(head: Optional[ListNode]) -\u0026gt; List[int]: out: List[int] = [] while head: out.append(head.val) head = head.next return out if __name__ == \u0026#34;__main__\u0026#34;: print(to_list(remove_nth_from_end(from_list([1, 2, 3, 4, 5]), 2))) # [1,2,3,5] print(to_list(remove_nth_from_end(from_list([1]), 1))) # [] print(to_list(remove_nth_from_end(from_list([1, 2]), 2))) # [2] E - Engineering (Real-world Applications) The transferable idea is: remove the kth element from tail in a single-direction chain.\nScenario 1: retry-trace chain trimming in backend jobs (Go) Background: microservices 
often keep a singly linked retry trace for task failures.\nWhy it fits: deleting the nth record from tail can reuse the exact fast/slow template.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { ID int Next *Node } func removeNthFromEnd(head *Node, n int) *Node { dummy := \u0026amp;Node{Next: head} fast, slow := dummy, dummy for i := 0; i \u0026lt; n; i++ { fast = fast.Next } for fast.Next != nil { fast = fast.Next slow = slow.Next } slow.Next = slow.Next.Next return dummy.Next } func printList(head *Node) { for p := head; p != nil; p = p.Next { fmt.Printf(\u0026#34;%d \u0026#34;, p.ID) } fmt.Println() } func main() { head := \u0026amp;Node{1, \u0026amp;Node{2, \u0026amp;Node{3, \u0026amp;Node{4, nil}}}} head = removeNthFromEnd(head, 2) printList(head) // 1 2 4 } Scenario 2: free-block chain cleanup in systems code (C) Background: simplified memory managers may keep free blocks in a singly linked list.\nWhy it fits: removing the nth node from the end can be done in one scan with deterministic pointer rewiring.\n#include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct Node { int addr; struct Node* next; }; struct Node* remove_nth_from_end(struct Node* head, int n) { struct Node dummy = {0, head}; struct Node *fast = \u0026amp;dummy, *slow = \u0026amp;dummy; for (int i = 0; i \u0026lt; n; ++i) fast = fast-\u0026gt;next; while (fast-\u0026gt;next) { fast = fast-\u0026gt;next; slow = slow-\u0026gt;next; } struct Node* del = slow-\u0026gt;next; slow-\u0026gt;next = del-\u0026gt;next; free(del); return dummy.next; } int main() { struct Node* n4 = (struct Node*)malloc(sizeof(struct Node)); struct Node* n3 = (struct Node*)malloc(sizeof(struct Node)); struct Node* n2 = (struct Node*)malloc(sizeof(struct Node)); struct Node* n1 = (struct Node*)malloc(sizeof(struct Node)); n1-\u0026gt;addr = 10; n1-\u0026gt;next = n2; n2-\u0026gt;addr = 20; n2-\u0026gt;next = n3; n3-\u0026gt;addr = 30; n3-\u0026gt;next = n4; n4-\u0026gt;addr = 40; 
n4-\u0026gt;next = NULL; struct Node* head = remove_nth_from_end(n1, 3); for (struct Node* p = head; p; p = p-\u0026gt;next) printf(\u0026#34;%d \u0026#34;, p-\u0026gt;addr); printf(\u0026#34;\\n\u0026#34;); while (head) { struct Node* t = head; head = head-\u0026gt;next; free(t); } return 0; } Scenario 3: undo-chain compaction in frontend editor state (JavaScript) Background: an editor can model undo history as a singly linked chain.\nWhy it fits: deleting the nth snapshot from tail uses the same list primitive as this problem.\nclass Node { constructor(v, next = null) { this.v = v; this.next = next; } } function removeNthFromEnd(head, n) { const dummy = new Node(0, head); let fast = dummy; let slow = dummy; for (let i = 0; i \u0026lt; n; i++) fast = fast.next; while (fast.next !== null) { fast = fast.next; slow = slow.next; } slow.next = slow.next.next; return dummy.next; } function print(head) { const arr = []; for (let p = head; p; p = p.next) arr.push(p.v); console.log(arr); } const head = new Node(1, new Node(2, new Node(3, new Node(4)))); print(removeNthFromEnd(head, 1)); // [1,2,3] R - Reflection (Deep Dive) Complexity Time: O(L), where L is list length Space: O(1) extra space (in-place pointer update) Approach comparison Approach Time Space Pros Cons Array conversion O(L) O(L) intuitive high extra space, less list-native Two-pass traversal O(L) O(1) stable and simple two scans One-pass + dummy O(L) O(1) single scan, unified boundaries requires gap invariant discipline Common mistakes Forgetting dummy node, which explodes branches for head deletion Off-by-one errors by moving fast n+1 or n-1 steps Forgetting to free removed node in C/C++ contexts Why this method is production-friendly Template-like and reusable across many linked-list variants Stable boundary behavior, especially for removing the head Efficient in performance-sensitive systems (O(1) extra memory) FAQ Q1: Why use while fast.next != null instead of while fast != null? 
Because we need slow to stop at the predecessor of the target node. Stopping when fast is exactly at the tail gives the correct predecessor position.\nQ2: What if n equals the list length? It still works. With dummy, slow stays at dummy and we remove the original head safely.\nQ3: Can this be written recursively? Yes, but recursion adds O(L) call-stack space. Iteration is generally more stable for long lists.\nBest Practices Always create dummy first, then implement deletion logic Standardize on \u0026ldquo;move fast by n first\u0026rdquo; to reduce off-by-one bugs Teach with two-pass baseline first, then optimize to one-pass In C/C++, explicitly release the removed node S - Summary Key takeaways \u0026ldquo;nth from end\u0026rdquo; can be transformed into a fixed-gap two-pointer problem. Dummy node is the safest boundary tool for linked-list deletion. One-pass fast/slow gives a strong engineering balance: O(L) time, O(1) extra space. This template transfers directly to many chain-structure rewiring tasks. Many medium problems are robust combinations of small stable templates. Recommended reading LeetCode 19 (official): https://leetcode.com/problems/remove-nth-node-from-end-of-list/ LeetCode CN: https://leetcode.cn/problems/remove-nth-node-from-end-of-list/ Related: LeetCode 21, LeetCode 206, LeetCode 25 CTA Rewrite this from memory now:\nImplement the two-pass version. Refactor it into one-pass fast/slow. Verify with three edge cases: n=1, n=len, and single-node list. 
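Drill 3 can be checked mechanically. The sketch below reuses the one-pass template from this article verbatim; the from_list / to_list helper names are illustrative:

```python
from typing import List, Optional


class ListNode:
    def __init__(self, val: int = 0, next: Optional["ListNode"] = None):
        self.val = val
        self.next = next


def remove_nth_from_end(head: Optional[ListNode], n: int) -> Optional[ListNode]:
    # One-pass fast/slow template with a fixed gap of n nodes.
    dummy = ListNode(0, head)
    fast = slow = dummy
    for _ in range(n):
        fast = fast.next
    while fast.next is not None:
        fast = fast.next
        slow = slow.next
    slow.next = slow.next.next
    return dummy.next


def from_list(nums: List[int]) -> Optional[ListNode]:
    dummy = ListNode()
    tail = dummy
    for x in nums:
        tail.next = ListNode(x)
        tail = tail.next
    return dummy.next


def to_list(head: Optional[ListNode]) -> List[int]:
    out: List[int] = []
    while head:
        out.append(head.val)
        head = head.next
    return out


# The three edge cases from the drill: n=1 (tail), n=len (head), single node.
assert to_list(remove_nth_from_end(from_list([1, 2, 3]), 1)) == [1, 2]
assert to_list(remove_nth_from_end(from_list([1, 2, 3]), 3)) == [2, 3]
assert to_list(remove_nth_from_end(from_list([1]), 1)) == []
```

If all three assertions pass, the dummy-node boundary handling is correct for both head and tail deletion.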
You will noticeably improve linked-list reliability in interviews and real code.\nRunnable Multi-language Implementations Python from typing import Optional, List class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def remove_nth_from_end(head: Optional[ListNode], n: int) -\u0026gt; Optional[ListNode]: dummy = ListNode(0, head) fast = slow = dummy for _ in range(n): fast = fast.next while fast.next is not None: fast = fast.next slow = slow.next slow.next = slow.next.next return dummy.next def from_list(nums: List[int]) -\u0026gt; Optional[ListNode]: dummy = ListNode() cur = dummy for x in nums: cur.next = ListNode(x) cur = cur.next return dummy.next def to_list(head: Optional[ListNode]) -\u0026gt; List[int]: out = [] while head: out.append(head.val) head = head.next return out if __name__ == \u0026#34;__main__\u0026#34;: h = from_list([1, 2, 3, 4, 5]) print(to_list(remove_nth_from_end(h, 2))) # [1, 2, 3, 5] C #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct ListNode { int val; struct ListNode* next; }; struct ListNode* new_node(int v) { struct ListNode* p = (struct ListNode*)malloc(sizeof(struct ListNode)); p-\u0026gt;val = v; p-\u0026gt;next = NULL; return p; } struct ListNode* removeNthFromEnd(struct ListNode* head, int n) { struct ListNode dummy = {0, head}; struct ListNode *fast = \u0026amp;dummy, *slow = \u0026amp;dummy; for (int i = 0; i \u0026lt; n; ++i) fast = fast-\u0026gt;next; while (fast-\u0026gt;next) { fast = fast-\u0026gt;next; slow = slow-\u0026gt;next; } struct ListNode* del = slow-\u0026gt;next; slow-\u0026gt;next = del-\u0026gt;next; free(del); return dummy.next; } void print_list(struct ListNode* head) { for (struct ListNode* p = head; p; p = p-\u0026gt;next) printf(\u0026#34;%d \u0026#34;, p-\u0026gt;val); printf(\u0026#34;\\n\u0026#34;); } void free_list(struct ListNode* head) { while (head) { struct ListNode* t = head; head = 
head-\u0026gt;next; free(t); } } int main() { struct ListNode* h1 = new_node(1); h1-\u0026gt;next = new_node(2); h1-\u0026gt;next-\u0026gt;next = new_node(3); h1-\u0026gt;next-\u0026gt;next-\u0026gt;next = new_node(4); h1-\u0026gt;next-\u0026gt;next-\u0026gt;next-\u0026gt;next = new_node(5); h1 = removeNthFromEnd(h1, 2); print_list(h1); // 1 2 3 5 free_list(h1); return 0; } C++ #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; using namespace std; struct ListNode { int val; ListNode* next; ListNode(int x) : val(x), next(nullptr) {} }; ListNode* removeNthFromEnd(ListNode* head, int n) { ListNode dummy(0); dummy.next = head; ListNode* fast = \u0026amp;dummy; ListNode* slow = \u0026amp;dummy; for (int i = 0; i \u0026lt; n; ++i) fast = fast-\u0026gt;next; while (fast-\u0026gt;next != nullptr) { fast = fast-\u0026gt;next; slow = slow-\u0026gt;next; } ListNode* del = slow-\u0026gt;next; slow-\u0026gt;next = del-\u0026gt;next; delete del; return dummy.next; } ListNode* build(const vector\u0026lt;int\u0026gt;\u0026amp; a) { ListNode dummy(0); ListNode* tail = \u0026amp;dummy; for (int x : a) { tail-\u0026gt;next = new ListNode(x); tail = tail-\u0026gt;next; } return dummy.next; } void print(ListNode* head) { for (ListNode* p = head; p; p = p-\u0026gt;next) cout \u0026lt;\u0026lt; p-\u0026gt;val \u0026lt;\u0026lt; \u0026#34; \u0026#34;; cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } void destroy(ListNode* head) { while (head) { ListNode* t = head; head = head-\u0026gt;next; delete t; } } int main() { ListNode* h = build({1, 2, 3, 4, 5}); h = removeNthFromEnd(h, 2); print(h); // 1 2 3 5 destroy(h); return 0; } Go package main import \u0026#34;fmt\u0026#34; type ListNode struct { Val int Next *ListNode } func removeNthFromEnd(head *ListNode, n int) *ListNode { dummy := \u0026amp;ListNode{Next: head} fast, slow := dummy, dummy for i := 0; i \u0026lt; n; i++ { fast = fast.Next } for fast.Next != nil { fast = fast.Next slow = slow.Next } slow.Next = 
slow.Next.Next return dummy.Next } func build(nums []int) *ListNode { dummy := \u0026amp;ListNode{} tail := dummy for _, x := range nums { tail.Next = \u0026amp;ListNode{Val: x} tail = tail.Next } return dummy.Next } func printList(head *ListNode) { for p := head; p != nil; p = p.Next { fmt.Printf(\u0026#34;%d \u0026#34;, p.Val) } fmt.Println() } func main() { head := build([]int{1, 2, 3, 4, 5}) head = removeNthFromEnd(head, 2) printList(head) // 1 2 3 5 } Rust (safe runnable two-pass variant) To keep ownership handling concise and safe in Rust, this implementation uses a two-pass traversal while preserving O(L) time and O(1) extra space.\n#[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { #[inline] fn new(val: i32) -\u0026gt; Self { ListNode { next: None, val } } } fn remove_nth_from_end(head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, n: i32) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { let mut len = 0usize; let mut p = head.as_ref(); while let Some(node) = p { len += 1; p = node.next.as_ref(); } let idx = len - n as usize; let mut dummy = Box::new(ListNode { val: 0, next: head }); let mut cur = \u0026amp;mut dummy; for _ in 0..idx { cur = cur.next.as_mut().unwrap(); } let next = cur.next.as_mut().and_then(|node| node.next.take()); cur.next = next; dummy.next } fn from_vec(a: Vec\u0026lt;i32\u0026gt;) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { let mut head = None; for \u0026amp;x in a.iter().rev() { let mut node = Box::new(ListNode::new(x)); node.next = head; head = Some(node); } head } fn to_vec(mut head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;) -\u0026gt; Vec\u0026lt;i32\u0026gt; { let mut out = Vec::new(); while let Some(mut node) = head { out.push(node.val); head = node.next.take(); } out } fn main() { let head = from_vec(vec![1, 2, 3, 4, 5]); let ans = remove_nth_from_end(head, 2); 
println!(\u0026#34;{:?}\u0026#34;, to_vec(ans)); // [1, 2, 3, 5] } JavaScript class ListNode { constructor(val = 0, next = null) { this.val = val; this.next = next; } } function removeNthFromEnd(head, n) { const dummy = new ListNode(0, head); let fast = dummy; let slow = dummy; for (let i = 0; i \u0026lt; n; i++) { fast = fast.next; } while (fast.next !== null) { fast = fast.next; slow = slow.next; } slow.next = slow.next.next; return dummy.next; } function fromArray(arr) { const dummy = new ListNode(); let tail = dummy; for (const x of arr) { tail.next = new ListNode(x); tail = tail.next; } return dummy.next; } function toArray(head) { const out = []; for (let p = head; p; p = p.next) out.push(p.val); return out; } const head = fromArray([1, 2, 3, 4, 5]); console.log(toArray(removeNthFromEnd(head, 2))); // [1,2,3,5] ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/19-remove-nth-node-from-end-of-list/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThe hard part is not deletion itself, but locating the predecessor of the nth node from the end in a singly linked list. 
This article derives the one-pass two-pointer solution from simpler baselines and explains correctness, boundaries, and engineering transfer.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003etwo pointers\u003c/code\u003e, \u003ccode\u003einterview high frequency\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: LeetCode 19, Remove Nth Node From End of List, linked list, fast/slow pointers, dummy node\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A complete ACERS walkthrough for removing the nth node from the end: from brute force to one-pass two pointers, with complexity, pitfalls, engineering scenarios, and Python/C/C++/Go/Rust/JS implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eBeginners building a stable template for linked-list interview problems\u003c/li\u003e\n\u003cli\u003eDevelopers who know fast/slow pointers but still make boundary mistakes\u003c/li\u003e\n\u003cli\u003eBackend/system engineers who want to transfer problem-solving templates to chain-structured data in production\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003e\u0026ldquo;Remove the nth node from the end\u0026rdquo; is a classic medium-level linked-list problem. 
The challenge is usually not the delete operation itself, but:\u003c/p\u003e","title":"LeetCode 19: Remove Nth Node From End of List (One-pass Two Pointers) ACERS Guide"},{"content":" Subtitle / Abstract\nThe real challenge in this problem is not traversing the list, but correctly cloning the cross-node reference relationships created by random pointers. This article moves from naive intuition to a hash-mapping solution, and explains why it is stable, maintainable, and practical in real engineering.\nEstimated reading time: 12–16 minutes Tags: Linked List, Deep Copy, Hash Table, Random Pointer SEO keywords: LeetCode 138, Copy List with Random Pointer, random list copy, deep copy, hash mapping Meta description: Perform deep copy of a random-pointer linked list via two passes plus a mapping table, with correctness, complexity, engineering practice, and six-language implementations. Target Readers Developers who feel shaky on random pointer problems while practicing LeetCode Learners who want to clearly understand \u0026ldquo;shallow copy vs deep copy\u0026rdquo; Engineers who want to transfer algorithmic thinking to real object-copy scenarios Background / Motivation For a normal linked list, copying val and next is straightforward.\nA random-pointer list adds one more pointer, random, which can:\nPoint to any node (earlier node, later node, or itself) Or be null That turns the problem from \u0026ldquo;linear copy\u0026rdquo; into \u0026ldquo;structure copy with extra references.\u0026rdquo;\nCommon engineering equivalents include:\nCopying workflow node objects while preserving cross-step jump relationships Copying cached object graphs while keeping internal references consistent Copying session chains while preserving backtracking / shortcut references Core Concepts Shallow Copy: copies only the node shell; internal references still point to old objects Deep Copy: rebuilds a full object graph; all references point to new objects Node Identity Mapping: old_node 
-\u0026gt; new_node, the key to rebuilding random Structural Equivalence: the new list is isomorphic to the old one in values and pointer relations, while sharing no nodes A — Algorithm (Problem and Algorithm) Problem Restatement Given a linked list of length n, each node has:\nval next random (can point to any node or null) Construct a deep copy of this list and return the new head node.\nNo pointer in the new list may point to any node in the original list.\nInput / Output Representation The problem statement often uses [val, random_index] to represent each node:\nval: node value random_index: index of the node pointed to by random; null if empty Your function input is only head, and your output is the copied list head.\nExample 1 Input: [[7,null],[13,0],[11,4],[10,2],[1,0]] Output: [[7,null],[13,0],[11,4],[10,2],[1,0]] Explanation: The output has the same value/reference structure as the input, but all nodes are newly created objects. Example 2 Input: [[1,1],[2,1]] Output: [[1,1],[2,1]] Explanation: The first node\u0026#39;s random points to the second node, and the second node\u0026#39;s random points to itself. 
Thought Process: From Naive to Maintainable Solution Naive Pitfall: handling random immediately during traversal If you try to set new.random when first visiting a node, you hit this issue:\nThe target node of random may not have been copied yet You need repeated backfilling, which increases branching complexity and risks missing edge cases Key Observation random cannot be rebuilt correctly without a node-identity mapping.\nOnce old -\u0026gt; new mapping exists, all pointer reconstruction becomes simple lookup operations.\nMethod Selection: two passes + hash mapping First pass: copy node values and build mapping map[old] = new Second pass: rebuild next and random from that mapping Advantages of this approach:\nIntuitive and easy to debug Easy to prove correctness Maintainable in both interviews and production code C — Concepts (Core Ideas) Algorithm Classification Linked-list traversal Hash mapping (object identity mapping) Graph copy (special graph: each node has at most two outgoing edges) Conceptual Model Treat the list as a directed graph:\nNode set: V Edge set: E = {next edges, random edges} The copy target is an isomorphic graph G', satisfying:\nval(v') = val(v) f(next(v)) = next(f(v)) f(random(v)) = random(f(v)) where f is the mapping old -\u0026gt; new.\nCorrectness Highlights (Brief) After pass one, each old node u has a unique copied node f(u) In pass two, for each edge u -\u0026gt; v, set f(u).ptr = f(v) (v may be null) Because each next/random edge is rewired via f, the copied structure is fully equivalent and contains no leaked references to old nodes Practical Guide / Steps Handle empty list first: if head == null, return null First pass: create a copied node for every old node and store it in mapping Second pass: set next and random for each copied node Return map[head] Runnable Python example:\nfrom typing import Optional, List class Node: def __init__(self, x: int, next: Optional[\u0026#34;Node\u0026#34;] = None, random: 
Optional[\u0026#34;Node\u0026#34;] = None): self.val = x self.next = next self.random = random def copy_random_list(head: Optional[Node]) -\u0026gt; Optional[Node]: if head is None: return None mp = {} cur = head while cur is not None: mp[cur] = Node(cur.val) cur = cur.next cur = head while cur is not None: mp[cur].next = mp.get(cur.next) mp[cur].random = mp.get(cur.random) cur = cur.next return mp[head] def build(arr: List[List[Optional[int]]]) -\u0026gt; Optional[Node]: if not arr: return None nodes = [Node(v) for v, _ in arr] for i in range(len(nodes) - 1): nodes[i].next = nodes[i + 1] for i, (_, r) in enumerate(arr): nodes[i].random = nodes[r] if r is not None else None return nodes[0] def dump(head: Optional[Node]) -\u0026gt; List[List[Optional[int]]]: out = [] idx = {} cur, i = head, 0 while cur is not None: idx[cur] = i cur = cur.next i += 1 cur = head while cur is not None: out.append([cur.val, idx.get(cur.random)]) cur = cur.next return out if __name__ == \u0026#34;__main__\u0026#34;: data = [[7, None], [13, 0], [11, 4], [10, 2], [1, 0]] src = build(data) cp = copy_random_list(src) print(dump(cp)) Code / Test Cases / Test Results Code Highlights Two passes: create nodes in pass one, connect edges in pass two map.get(None) == None (Python) reduces explicit null-check branches Test Cases Case 1: [] Expected: [] Case 2: [[1,null]] Expected: [[1,null]] Case 3: [[1,0]] Expected: [[1,0]] (self-pointing random) Case 4: [[7,null],[13,0],[11,4],[10,2],[1,0]] Expected: same structure after copy Test Results (Sample) All tests passed: structure is equivalent, and node addresses in the copied list are completely different from the original list. 
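The identity-separation claim in the test results above can be checked mechanically. Below is a minimal self-contained sketch (re-declaring the article's Node, copy_random_list, build, and dump helpers) that asserts, for each listed case, both structural equality and zero shared node objects between original and copy:

```python
class Node:
    def __init__(self, x, next=None, random=None):
        self.val = x
        self.next = next
        self.random = random

def copy_random_list(head):
    # Two passes: create copies and the old -> new mapping, then rewire edges.
    if head is None:
        return None
    mp = {}
    cur = head
    while cur:
        mp[cur] = Node(cur.val)
        cur = cur.next
    cur = head
    while cur:
        mp[cur].next = mp.get(cur.next)
        mp[cur].random = mp.get(cur.random)
        cur = cur.next
    return mp[head]

def build(arr):
    # Build a list from [val, random_index] pairs.
    if not arr:
        return None
    nodes = [Node(v) for v, _ in arr]
    for i in range(len(nodes) - 1):
        nodes[i].next = nodes[i + 1]
    for i, (_, r) in enumerate(arr):
        nodes[i].random = nodes[r] if r is not None else None
    return nodes[0]

def dump(head):
    # Serialize back to [val, random_index] pairs.
    out, idx, cur, i = [], {}, head, 0
    while cur:
        idx[cur] = i
        cur = cur.next
        i += 1
    cur = head
    while cur:
        out.append([cur.val, idx.get(cur.random)])
        cur = cur.next
    return out

def nodes_of(head):
    out = []
    while head:
        out.append(head)
        head = head.next
    return out

cases = ([], [[1, None]], [[1, 0]],
         [[7, None], [13, 0], [11, 4], [10, 2], [1, 0]])
for case in cases:
    src = build(case)
    cp = copy_random_list(src)
    assert dump(cp) == case                                   # same structure
    assert not set(map(id, nodes_of(src))) & set(map(id, nodes_of(cp)))  # no shared nodes
print("all cases passed")
```

The `nodes_of`/identity check via `id()` is the sketch's own addition; it is one way to encode "node addresses in the copied list are completely different from the original list" as a regression assertion.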
E — Engineering (Engineering Applications) Scenario 1: Deep copy of workflow definitions (Python) Background: workflow nodes have sequential next, and may also include jump references (similar to random).\nWhy it fits: when cloning a template into a new workflow, jump relationships must be preserved without contaminating the original template.\nclass Step: def __init__(self, name): self.name = name self.next = None self.jump = None def copy_steps(head): if not head: return None mp = {} cur = head while cur: mp[cur] = Step(cur.name) cur = cur.next cur = head while cur: mp[cur].next = mp.get(cur.next) mp[cur].jump = mp.get(cur.jump) cur = cur.next return mp[head] Scenario 2: Backend task-chain copy (Go) Background: task nodes execute linearly, but can jump back to a compensation node on failure.\nWhy it fits: failure-jump relationships are fundamentally random references and must be reconstructed during copying.\npackage main import \u0026#34;fmt\u0026#34; type Task struct { Name string Next *Task Backup *Task } func copyTasks(head *Task) *Task { if head == nil { return nil } mp := map[*Task]*Task{} for cur := head; cur != nil; cur = cur.Next { mp[cur] = \u0026amp;Task{Name: cur.Name} } for cur := head; cur != nil; cur = cur.Next { mp[cur].Next = mp[cur.Next] mp[cur].Backup = mp[cur.Backup] } return mp[head] } func main() { a := \u0026amp;Task{Name: \u0026#34;A\u0026#34;} b := \u0026amp;Task{Name: \u0026#34;B\u0026#34;} a.Next = b b.Backup = b cp := copyTasks(a) fmt.Println(cp.Name, cp.Next.Name, cp.Next.Backup == cp.Next) // A B true } Scenario 3: Frontend editor-history chain copy (JavaScript) Background: editor history usually has a linear chain plus references for quick-jump key versions.\nWhy it fits: when switching user sessions, copying history chains avoids cross-session object-reference contamination.\nclass Version { constructor(id) { this.id = id; this.next = null; this.jump = null; } } function copyVersions(head) { if (!head) return null; const mp = new 
Map(); for (let cur = head; cur; cur = cur.next) mp.set(cur, new Version(cur.id)); for (let cur = head; cur; cur = cur.next) { mp.get(cur).next = mp.get(cur.next) || null; mp.get(cur).jump = mp.get(cur.jump) || null; } return mp.get(head); } R — Reflection (Reflection and Depth) Complexity Analysis Time complexity: O(n) (two linear passes) Space complexity: O(n) (mapping table) Alternative Approach Comparison Approach Time Extra Space Evaluation Two-pass hash mapping (this article) O(n) O(n) Easiest to write, most stable, high maintainability Interleaving-list method (insert copies in-place, then split) O(n) O(1) Better space usage, but more implementation details Serialize + deserialize Usually \u0026gt; O(n) Depends on format Possible in engineering, but not ideal for core interview evaluation Common Incorrect Approaches Copying only val/next and forgetting random Accidentally pointing copied random back to original nodes Using old-node pointers directly in second-pass rewiring instead of mapped new nodes Forgetting to handle head == null Why this method is more practical in engineering Clear logical layering (node creation and edge rewiring are separated) Easy to debug (check mapping scale first, then pointer connections) Team-friendly and easier for newcomers to maintain quickly Frequently Asked Questions and Notes (FAQ) Q1: Why can this be viewed as a graph-copy problem? Because each node has two edge types, next and random; what we copy is the full node-edge relationship, not just linear list order.\nQ2: Can it be done in one pass? It is possible in theory, but code complexity and bug risk rise significantly. In interviews and engineering, the two-pass hash-mapping version is recommended.\nQ3: Is a hash table required? Not strictly required. 
If you pursue O(1) extra space, you can use the interleaving-list method, but readability is usually worse than hash mapping.\nBest Practices and Recommendations Separate \u0026ldquo;copy nodes\u0026rdquo; and \u0026ldquo;rebuild pointers\u0026rdquo; into two phases to avoid state confusion Use node object identity as mapping key, not node values Cover these regression cases: empty list, self-pointing random, cross-pointing random, tail node with random = null For debugging output, [val, random_index] is usually more intuitive than raw addresses S — Summary (Summary) Key takeaways:\nThis problem is essentially \u0026ldquo;object identity mapping + pointer rewiring,\u0026rdquo; not ordinary linear list copy. The two-pass approach splits the problem into node creation and edge rewiring, improving both correctness and maintainability. Correct random reconstruction depends on a complete old -\u0026gt; new mapping. Hash mapping is an extremely stable engineering baseline and the clearest way to explain the solution in interviews. Once understood, this pattern transfers naturally to graph copy, workflow clone, and object-graph duplication scenarios. 
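For completeness, the O(1)-extra-space interleaving alternative mentioned in the comparison table and FAQ Q3 can be sketched in Python as follows. This is an illustrative sketch, not the article's recommended baseline; the function name and minimal Node class are this sketch's own:

```python
class Node:
    def __init__(self, x, next=None, random=None):
        self.val = x
        self.next = next
        self.random = random

def copy_random_list_interleaved(head):
    if head is None:
        return None
    # Pass 1: insert each copy right after its original (A -> A' -> B -> B').
    cur = head
    while cur:
        cur.next = Node(cur.val, cur.next)
        cur = cur.next.next
    # Pass 2: the copy of old.random is always old.random.next.
    cur = head
    while cur:
        if cur.random is not None:
            cur.next.random = cur.random.next
        cur = cur.next.next
    # Pass 3: split the two lists apart, restoring the original.
    new_head = head.next
    cur = head
    while cur:
        cp = cur.next
        cur.next = cp.next
        cp.next = cp.next.next if cp.next else None
        cur = cur.next
    return new_head
```

The interleaving replaces the hash table with positional adjacency (copy always follows original), which is exactly why it saves space at the cost of three carefully ordered pointer passes.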
Recommended follow-up reading:\nLeetCode 133 Clone Graph LeetCode 146 LRU Cache (hash mapping + linked-list coordination) LeetCode 21 / 206 (fundamental linked-list drills) Designing Data-Intensive Applications sections on object relationships and data copying Multi-language Runnable Implementations Python from typing import Optional class Node: def __init__(self, x: int, next: Optional[\u0026#34;Node\u0026#34;] = None, random: Optional[\u0026#34;Node\u0026#34;] = None): self.val = x self.next = next self.random = random class Solution: def copyRandomList(self, head: Optional[Node]) -\u0026gt; Optional[Node]: if head is None: return None mp = {} cur = head while cur is not None: mp[cur] = Node(cur.val) cur = cur.next cur = head while cur is not None: mp[cur].next = mp.get(cur.next) mp[cur].random = mp.get(cur.random) cur = cur.next return mp[head] C #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct Node { int val; struct Node* next; struct Node* random; }; struct Node* new_node(int v) { struct Node* n = (struct Node*)malloc(sizeof(struct Node)); n-\u0026gt;val = v; n-\u0026gt;next = NULL; n-\u0026gt;random = NULL; return n; } // Interleaving-list method: O(n) time, O(1) extra space struct Node* copyRandomList(struct Node* head) { if (head == NULL) return NULL; struct Node* cur = head; while (cur != NULL) { struct Node* cp = new_node(cur-\u0026gt;val); cp-\u0026gt;next = cur-\u0026gt;next; cur-\u0026gt;next = cp; cur = cp-\u0026gt;next; } cur = head; while (cur != NULL) { struct Node* cp = cur-\u0026gt;next; cp-\u0026gt;random = (cur-\u0026gt;random != NULL) ? cur-\u0026gt;random-\u0026gt;next : NULL; cur = cp-\u0026gt;next; } struct Node* new_head = head-\u0026gt;next; cur = head; while (cur != NULL) { struct Node* cp = cur-\u0026gt;next; cur-\u0026gt;next = cp-\u0026gt;next; cp-\u0026gt;next = (cp-\u0026gt;next != NULL) ? 
cp-\u0026gt;next-\u0026gt;next : NULL; cur = cur-\u0026gt;next; } return new_head; } void print_list(struct Node* head) { struct Node* arr[128]; int n = 0; for (struct Node* p = head; p != NULL; p = p-\u0026gt;next) arr[n++] = p; for (int i = 0; i \u0026lt; n; i++) { int r = -1; for (int j = 0; j \u0026lt; n; j++) { if (arr[i]-\u0026gt;random == arr[j]) { r = j; break; } } if (r \u0026gt;= 0) printf(\u0026#34;[%d,%d] \u0026#34;, arr[i]-\u0026gt;val, r); else printf(\u0026#34;[%d,null] \u0026#34;, arr[i]-\u0026gt;val); } printf(\u0026#34;\\n\u0026#34;); } int main(void) { struct Node* a = new_node(1); struct Node* b = new_node(2); a-\u0026gt;next = b; a-\u0026gt;random = b; b-\u0026gt;random = b; struct Node* cp = copyRandomList(a); print_list(cp); // [1,1] [2,1] return 0; } C++ #include \u0026lt;iostream\u0026gt; #include \u0026lt;unordered_map\u0026gt; using namespace std; class Node { public: int val; Node* next; Node* random; Node(int _val) : val(_val), next(nullptr), random(nullptr) {} }; class Solution { public: Node* copyRandomList(Node* head) { if (!head) return nullptr; unordered_map\u0026lt;Node*, Node*\u0026gt; mp; for (Node* cur = head; cur; cur = cur-\u0026gt;next) { mp[cur] = new Node(cur-\u0026gt;val); } for (Node* cur = head; cur; cur = cur-\u0026gt;next) { mp[cur]-\u0026gt;next = cur-\u0026gt;next ? mp[cur-\u0026gt;next] : nullptr; mp[cur]-\u0026gt;random = cur-\u0026gt;random ? 
mp[cur-\u0026gt;random] : nullptr; } return mp[head]; } }; Go package main type Node struct { Val int Next *Node Random *Node } func copyRandomList(head *Node) *Node { if head == nil { return nil } mp := map[*Node]*Node{} for cur := head; cur != nil; cur = cur.Next { mp[cur] = \u0026amp;Node{Val: cur.Val} } for cur := head; cur != nil; cur = cur.Next { mp[cur].Next = mp[cur.Next] mp[cur].Random = mp[cur.Random] } return mp[head] } Rust use std::cell::RefCell; use std::collections::HashMap; use std::rc::Rc; #[derive(Debug)] struct Node { val: i32, next: Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt;, random: Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt;, } impl Node { fn new(val: i32) -\u0026gt; Self { Self { val, next: None, random: None } } } fn copy_random_list(head: Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt;) -\u0026gt; Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt; { let start = head.clone()?; let mut mp: HashMap\u0026lt;*const RefCell\u0026lt;Node\u0026gt;, Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt; = HashMap::new(); let mut cur = head.clone(); while let Some(node_rc) = cur { let ptr = Rc::as_ptr(\u0026amp;node_rc); let val = node_rc.borrow().val; mp.insert(ptr, Rc::new(RefCell::new(Node::new(val)))); cur = node_rc.borrow().next.clone(); } cur = head; while let Some(node_rc) = cur { let old_ptr = Rc::as_ptr(\u0026amp;node_rc); let new_node = mp.get(\u0026amp;old_ptr).unwrap().clone(); let next_old = node_rc.borrow().next.clone(); let random_old = node_rc.borrow().random.clone(); { let mut nm = new_node.borrow_mut(); nm.next = next_old .as_ref() .and_then(|x| mp.get(\u0026amp;Rc::as_ptr(x)).cloned()); nm.random = random_old .as_ref() .and_then(|x| mp.get(\u0026amp;Rc::as_ptr(x)).cloned()); } cur = next_old; } mp.get(\u0026amp;Rc::as_ptr(\u0026amp;start)).cloned() } fn main() { let n1 = Rc::new(RefCell::new(Node::new(1))); let n2 = 
Rc::new(RefCell::new(Node::new(2))); n1.borrow_mut().next = Some(n2.clone()); n1.borrow_mut().random = Some(n2.clone()); n2.borrow_mut().random = Some(n2.clone()); let cp = copy_random_list(Some(n1)).unwrap(); println!(\u0026#34;{}\u0026#34;, cp.borrow().val); // 1 } JavaScript function Node(val, next = null, random = null) { this.val = val; this.next = next; this.random = random; } function copyRandomList(head) { if (head === null) return null; const mp = new Map(); for (let cur = head; cur !== null; cur = cur.next) { mp.set(cur, new Node(cur.val)); } for (let cur = head; cur !== null; cur = cur.next) { mp.get(cur).next = cur.next ? mp.get(cur.next) : null; mp.get(cur).random = cur.random ? mp.get(cur.random) : null; } return mp.get(head); } Call to Action (CTA) I recommend doing these two reinforcement steps right now:\nWrite the two-pass hash-mapping solution once from memory and pass your own tests. Then tackle LeetCode 133 Clone Graph to transfer identity-mapping copy logic to a more general graph structure. If you want, I can write the next article on LeetCode 146 LRU Cache, extending from \u0026ldquo;hash + linked list\u0026rdquo; in copy problems to cache-eviction design.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/linked-list/138-copy-list-with-random-pointer/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nThe real challenge in this problem is not traversing the list, but correctly cloning the cross-node reference relationships created by \u003ccode\u003erandom\u003c/code\u003e pointers. 
This article moves from naive intuition to a hash-mapping solution, and explains why it is stable, maintainable, and practical in real engineering.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 12–16 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eLinked List\u003c/code\u003e, \u003ccode\u003eDeep Copy\u003c/code\u003e, \u003ccode\u003eHash Table\u003c/code\u003e, \u003ccode\u003eRandom Pointer\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: LeetCode 138, Copy List with Random Pointer, random list copy, deep copy, hash mapping\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Perform deep copy of a random-pointer linked list via two passes plus a mapping table, with correctness, complexity, engineering practice, and six-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eDevelopers who feel shaky on \u003ccode\u003erandom\u003c/code\u003e pointer problems while practicing LeetCode\u003c/li\u003e\n\u003cli\u003eLearners who want to clearly understand \u0026ldquo;shallow copy vs deep copy\u0026rdquo;\u003c/li\u003e\n\u003cli\u003eEngineers who want to transfer algorithmic thinking to real object-copy scenarios\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eFor a normal linked list, copying \u003ccode\u003eval\u003c/code\u003e and \u003ccode\u003enext\u003c/code\u003e is straightforward.\u003cbr\u003e\nA random-pointer list adds one more pointer, \u003ccode\u003erandom\u003c/code\u003e, which can:\u003c/p\u003e","title":"LeetCode 138: Copy List with Random Pointer — A Complete Deep-Copy Breakdown"},{"content":" Subtitle / Summary\nThis problem is just 
grade-school addition on a linked list: add one digit at a time, propagate carry, and append one final node if carry remains after both lists end. We move from naive ideas to the optimal one-pass solution, then map it to real engineering scenarios.\nReading time: 12-15 min Tags: linked list, carry, simulation, LeetCode 2 SEO keywords: Add Two Numbers, LeetCode 2, reverse-order list, carry, dummy node Meta description: Use dummy + tail + carry to sum two reverse-order linked lists in O(max(m,n)) time, with common pitfalls, engineering analogies, and six-language runnable implementations. Target Readers Beginners building a stable template for linked-list problems Intermediate developers who often miss carry or boundary cases Engineers who want to transfer algorithmic thinking to stream-style data processing Background / Motivation This looks like an entry-level LeetCode problem, but it trains practical skills you will reuse:\nSynchronous progression across multiple input streams (l1, l2) Cross-iteration state propagation (carry) Boundary completeness (different lengths, final carry node) These three appear frequently in production systems: chunked amount accumulation, multi-source counter merge, and streaming aggregation with backfill.\nCore Concepts Reverse-order storage: ones digit at the head, then tens, then hundreds\u0026hellip; Digit-wise addition: each round handles only x + y + carry Carry propagation: carry = sum // 10, current digit sum % 10 Dummy node: avoids special handling when creating the result head A — Algorithm (Problem \u0026amp; Algorithm) Problem Restatement You are given two non-empty linked lists representing two non-negative integers.\nDigits are stored in reverse order, and each node stores one digit.\nReturn their sum as a linked list in the same reverse order.\nExcept for number 0, the input numbers do not have leading zeros.\nInput / Output Item Meaning Input Two linked lists l1, l2, each node value in 0~9 Output A new linked list 
representing l1 + l2 in reverse order Example 1 Input: l1 = [2,4,3], l2 = [5,6,4] Explanation: 342 + 465 = 807 Output: [7,0,8] Example 2 Input: l1 = [9,9,9,9,9,9,9], l2 = [9,9,9,9] Explanation: 9999999 + 9999 = 10009998 Output: [8,9,9,9,0,0,0,1] Derivation: from Naive to Optimal Naive idea 1: convert to integers, add, then rebuild list Convert lists to integers n1, n2 Compute n1 + n2 Split the result back to digits and build a new list Problems:\nMay overflow in many languages for long inputs Extra conversion in both directions Misses the essence of linked-list digit simulation Naive idea 2: convert to arrays first, then add by index Convert both lists to arrays Add digit by digit Problems:\nNeeds O(m+n) extra space Inputs are already low-digit-first, so the array layer is unnecessary Key Observation The lists already start from the ones digit, exactly what column addition needs Each round depends only on current digits and carry One linear pass is enough Method Choice Use dummy + tail to build the output. 
Loop while:\nwhile l1 != null or l2 != null or carry != 0 Per iteration:\nRead current digits x, y (treat missing node as 0) sum = x + y + carry Append node sum % 10 Update carry = sum // 10 C — Concepts (Core Ideas) Method Category Linked-list simulation Carry state machine Dual-pointer synchronous traversal State Model Let x_k, y_k be the digits at round k, and c_k be carry-in:\ns_k = x_k + y_k + c_k digit_k = s_k mod 10 c_(k+1) = floor(s_k / 10) where c_k ∈ {0,1}.\nThis is the exact mathematical form of decimal column addition.\nCorrectness Intuition digit_k is exactly the k-th digit of the result carry passes the overflow (\u0026gt;9) to the next round If both lists end but carry=1, append one final node Practice Guide / Steps Initialize dummy, tail, and carry = 0 Loop while either list remains or carry is non-zero Read current values: x = l1.val if l1 else 0, y = l2.val if l2 else 0 Compute sum, append sum % 10 Update carry = sum // 10, move tail and input pointers Return dummy.next Minimal runnable Python example:\nfrom typing import Optional, List class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def add_two_numbers(l1: Optional[ListNode], l2: Optional[ListNode]) -\u0026gt; Optional[ListNode]: dummy = ListNode(0) tail = dummy carry = 0 while l1 is not None or l2 is not None or carry: x = l1.val if l1 is not None else 0 y = l2.val if l2 is not None else 0 s = x + y + carry carry = s // 10 tail.next = ListNode(s % 10) tail = tail.next if l1 is not None: l1 = l1.next if l2 is not None: l2 = l2.next return dummy.next def build(nums: List[int]) -\u0026gt; Optional[ListNode]: dummy = ListNode() tail = dummy for n in nums: tail.next = ListNode(n) tail = tail.next return dummy.next def dump(head: Optional[ListNode]) -\u0026gt; List[int]: out: List[int] = [] while head is not None: out.append(head.val) head = head.next return out if __name__ == \u0026#34;__main__\u0026#34;: a = 
build([2, 4, 3]) b = build([5, 6, 4]) print(dump(add_two_numbers(a, b))) # [7, 0, 8] E — Engineering (Real-world Mapping) Scenario 1: chunked amount merge in finance (Python) Background: some systems store or transmit very large amounts in chunks.\nWhy this maps well: each chunk behaves like one digit group; the core is still same-position add + carry propagation.\ndef add_digits(a, b): i = j = 0 carry = 0 out = [] while i \u0026lt; len(a) or j \u0026lt; len(b) or carry: x = a[i] if i \u0026lt; len(a) else 0 y = b[j] if j \u0026lt; len(b) else 0 s = x + y + carry out.append(s % 10) carry = s // 10 i += 1 j += 1 return out print(add_digits([2, 4, 3], [5, 6, 4])) # [7,0,8] Scenario 2: merge low-digit-first counters from multiple services (Go) Background: two backend services report low-digit-first counter blocks.\nWhy this maps well: digit-wise merge is stream-friendly and memory-stable.\npackage main import \u0026#34;fmt\u0026#34; func addDigits(a, b []int) []int { i, j, carry := 0, 0, 0 out := make([]int, 0) for i \u0026lt; len(a) || j \u0026lt; len(b) || carry \u0026gt; 0 { x, y := 0, 0 if i \u0026lt; len(a) { x = a[i] i++ } if j \u0026lt; len(b) { y = b[j] j++ } s := x + y + carry out = append(out, s%10) carry = s / 10 } return out } func main() { fmt.Println(addDigits([]int{9, 9, 9}, []int{1})) // [0 0 0 1] } Scenario 3: offline draft version increment in frontend (JavaScript) Background: offline editors may split very long version numbers into digits/chunks.\nWhy this maps well: browser-side processing avoids dependency on big-integer libraries.\nfunction addDigits(a, b) { let i = 0; let j = 0; let carry = 0; const out = []; while (i \u0026lt; a.length || j \u0026lt; b.length || carry) { const x = i \u0026lt; a.length ? a[i++] : 0; const y = j \u0026lt; b.length ? 
b[j++] : 0; const s = x + y + carry; out.push(s % 10); carry = Math.floor(s / 10); } return out; } console.log(addDigits([2, 4, 3], [5, 6, 4])); // [7,0,8] R — Reflection (Deep Dive) Complexity Time: O(max(m, n)) Space: O(max(m, n)) for result list; auxiliary space is O(1) Alternative Comparison Approach Time Extra Space Problem Convert to integers then add O(m+n) depends on big-int overflow risk or big-int dependency Convert to arrays then add O(m+n) O(m+n) unnecessary middle layer One-pass list simulation (this) O(max(m,n)) O(1) auxiliary clear boundaries, production-friendly Common Mistakes Forgetting carry != 0 in the loop condition, so 999 + 1 loses the last digit Dereferencing null when list lengths differ Over-optimizing in-place reuse of input lists and making code branches hard to reason about Why this method is optimal and practical Single pass and direct mapping to decimal addition No dependency on language big-integer support Unified boundary handling, easy to test and port FAQ Q1: Why must the loop condition include carry? Because both lists may end while a carry is still pending. Example: 5 + 5 = 10 still needs one more node 1.\nQ2: Can we modify l1 or l2 in place? Possible, but usually not worth it: branch complexity increases and you may break caller expectations about input reuse. In interviews and production code, building a new result list is cleaner.\nQ3: What if digits are stored in forward order? That is a different problem (LeetCode 445). Typical solutions use stack or recursion from high digit to low digit, unlike this low-digit-first model.\nBest Practices Keep the stable template: dummy + tail + carry Use while l1 or l2 or carry to unify boundaries Always test three cases: same length, different lengths, all-carry chain Keep “missing node means 0” in one place to avoid scattered null checks S — Summary Key takeaways:\nReverse-order list addition is a decimal digit state machine. 
carry is cross-round state and must be included in loop condition. A dummy node removes fragile head-special-case logic. This is a foundational template for linked-list simulation and boundary management. The same model transfers well to chunked numeric merge and streaming counters. Recommended follow-ups:\nLeetCode 445 Add Two Numbers II (forward-order digits) LeetCode 21 Merge Two Sorted Lists (dual-pointer list template) LeetCode 206 Reverse Linked List (core linked-list operations) CLRS chapters on linked lists and basic data structures Runnable Multi-language Implementations Python from typing import Optional, List class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next class Solution: def addTwoNumbers(self, l1: Optional[ListNode], l2: Optional[ListNode]) -\u0026gt; Optional[ListNode]: dummy = ListNode(0) tail = dummy carry = 0 while l1 is not None or l2 is not None or carry: x = l1.val if l1 else 0 y = l2.val if l2 else 0 s = x + y + carry carry = s // 10 tail.next = ListNode(s % 10) tail = tail.next if l1: l1 = l1.next if l2: l2 = l2.next return dummy.next def build(nums: List[int]) -\u0026gt; Optional[ListNode]: d = ListNode() t = d for v in nums: t.next = ListNode(v) t = t.next return d.next def dump(head: Optional[ListNode]) -\u0026gt; List[int]: out: List[int] = [] while head: out.append(head.val) head = head.next return out if __name__ == \u0026#34;__main__\u0026#34;: ans = Solution().addTwoNumbers(build([2, 4, 3]), build([5, 6, 4])) print(dump(ans)) # [7, 0, 8] C #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct ListNode { int val; struct ListNode* next; }; struct ListNode* new_node(int v) { struct ListNode* n = (struct ListNode*)malloc(sizeof(struct ListNode)); n-\u0026gt;val = v; n-\u0026gt;next = NULL; return n; } struct ListNode* addTwoNumbers(struct ListNode* l1, struct ListNode* l2) { struct ListNode dummy; dummy.val = 0; dummy.next = NULL; 
struct ListNode* tail = \u0026amp;dummy; int carry = 0; while (l1 != NULL || l2 != NULL || carry != 0) { int x = (l1 != NULL) ? l1-\u0026gt;val : 0; int y = (l2 != NULL) ? l2-\u0026gt;val : 0; int s = x + y + carry; carry = s / 10; tail-\u0026gt;next = new_node(s % 10); tail = tail-\u0026gt;next; if (l1 != NULL) l1 = l1-\u0026gt;next; if (l2 != NULL) l2 = l2-\u0026gt;next; } return dummy.next; } struct ListNode* build(const int* a, int n) { struct ListNode dummy; dummy.next = NULL; struct ListNode* tail = \u0026amp;dummy; for (int i = 0; i \u0026lt; n; i++) { tail-\u0026gt;next = new_node(a[i]); tail = tail-\u0026gt;next; } return dummy.next; } void print_list(struct ListNode* h) { while (h != NULL) { printf(\u0026#34;%d\u0026#34;, h-\u0026gt;val); if (h-\u0026gt;next != NULL) printf(\u0026#34; -\u0026gt; \u0026#34;); h = h-\u0026gt;next; } printf(\u0026#34;\\n\u0026#34;); } void free_list(struct ListNode* h) { while (h != NULL) { struct ListNode* nxt = h-\u0026gt;next; free(h); h = nxt; } } int main(void) { int a[] = {2, 4, 3}; int b[] = {5, 6, 4}; struct ListNode* l1 = build(a, 3); struct ListNode* l2 = build(b, 3); struct ListNode* ans = addTwoNumbers(l1, l2); print_list(ans); // 7 -\u0026gt; 0 -\u0026gt; 8 free_list(l1); free_list(l2); free_list(ans); return 0; } C++ #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; using namespace std; struct ListNode { int val; ListNode* next; ListNode(int x = 0) : val(x), next(nullptr) {} }; class Solution { public: ListNode* addTwoNumbers(ListNode* l1, ListNode* l2) { ListNode dummy(0); ListNode* tail = \u0026amp;dummy; int carry = 0; while (l1 || l2 || carry) { int x = l1 ? l1-\u0026gt;val : 0; int y = l2 ? 
l2-\u0026gt;val : 0; int s = x + y + carry; carry = s / 10; tail-\u0026gt;next = new ListNode(s % 10); tail = tail-\u0026gt;next; if (l1) l1 = l1-\u0026gt;next; if (l2) l2 = l2-\u0026gt;next; } return dummy.next; } }; ListNode* build(const vector\u0026lt;int\u0026gt;\u0026amp; a) { ListNode dummy; ListNode* tail = \u0026amp;dummy; for (int v : a) { tail-\u0026gt;next = new ListNode(v); tail = tail-\u0026gt;next; } return dummy.next; } void printList(ListNode* h) { while (h) { cout \u0026lt;\u0026lt; h-\u0026gt;val; if (h-\u0026gt;next) cout \u0026lt;\u0026lt; \u0026#34; -\u0026gt; \u0026#34;; h = h-\u0026gt;next; } cout \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; } void freeList(ListNode* h) { while (h) { ListNode* nxt = h-\u0026gt;next; delete h; h = nxt; } } int main() { ListNode* l1 = build({2, 4, 3}); ListNode* l2 = build({5, 6, 4}); ListNode* ans = Solution().addTwoNumbers(l1, l2); printList(ans); // 7 -\u0026gt; 0 -\u0026gt; 8 freeList(l1); freeList(l2); freeList(ans); return 0; } Go package main import \u0026#34;fmt\u0026#34; type ListNode struct { Val int Next *ListNode } func addTwoNumbers(l1 *ListNode, l2 *ListNode) *ListNode { dummy := \u0026amp;ListNode{} tail := dummy carry := 0 for l1 != nil || l2 != nil || carry != 0 { x, y := 0, 0 if l1 != nil { x = l1.Val l1 = l1.Next } if l2 != nil { y = l2.Val l2 = l2.Next } s := x + y + carry carry = s / 10 tail.Next = \u0026amp;ListNode{Val: s % 10} tail = tail.Next } return dummy.Next } func build(a []int) *ListNode { dummy := \u0026amp;ListNode{} tail := dummy for _, v := range a { tail.Next = \u0026amp;ListNode{Val: v} tail = tail.Next } return dummy.Next } func printList(h *ListNode) { for h != nil { fmt.Print(h.Val) if h.Next != nil { fmt.Print(\u0026#34; -\u0026gt; \u0026#34;) } h = h.Next } fmt.Println() } func main() { l1 := build([]int{2, 4, 3}) l2 := build([]int{5, 6, 4}) ans := addTwoNumbers(l1, l2) printList(ans) // 7 -\u0026gt; 0 -\u0026gt; 8 } Rust #[derive(PartialEq, Eq, Clone, Debug)] pub struct 
ListNode { pub val: i32, pub next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { #[inline] fn new(val: i32) -\u0026gt; Self { ListNode { next: None, val } } } pub fn add_two_numbers( mut l1: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, mut l2: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, ) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { let mut digits: Vec\u0026lt;i32\u0026gt; = Vec::new(); let mut carry = 0; while l1.is_some() || l2.is_some() || carry \u0026gt; 0 { let mut x = 0; let mut y = 0; if let Some(mut node) = l1 { x = node.val; l1 = node.next.take(); } else { l1 = None; } if let Some(mut node) = l2 { y = node.val; l2 = node.next.take(); } else { l2 = None; } let s = x + y + carry; carry = s / 10; digits.push(s % 10); } let mut head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; = None; let mut tail = \u0026amp;mut head; for d in digits { *tail = Some(Box::new(ListNode::new(d))); if let Some(node) = tail { tail = \u0026amp;mut node.next; } } head } fn build(nums: \u0026amp;[i32]) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { let mut head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; = None; let mut tail = \u0026amp;mut head; for \u0026amp;n in nums { *tail = Some(Box::new(ListNode::new(n))); if let Some(node) = tail { tail = \u0026amp;mut node.next; } } head } fn dump(mut head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;) -\u0026gt; Vec\u0026lt;i32\u0026gt; { let mut out = Vec::new(); while let Some(mut node) = head { out.push(node.val); head = node.next.take(); } out } fn main() { let l1 = build(\u0026amp;[2, 4, 3]); let l2 = build(\u0026amp;[5, 6, 4]); let ans = add_two_numbers(l1, l2); println!(\u0026#34;{:?}\u0026#34;, dump(ans)); // [7, 0, 8] } JavaScript function ListNode(val = 0, next = null) { this.val = val; this.next = next; } function addTwoNumbers(l1, l2) { const dummy = new ListNode(0); let tail = dummy; let carry = 0; while (l1 
!== null || l2 !== null || carry !== 0) { const x = l1 ? l1.val : 0; const y = l2 ? l2.val : 0; const s = x + y + carry; carry = Math.floor(s / 10); tail.next = new ListNode(s % 10); tail = tail.next; if (l1) l1 = l1.next; if (l2) l2 = l2.next; } return dummy.next; } function build(arr) { const dummy = new ListNode(); let tail = dummy; for (const v of arr) { tail.next = new ListNode(v); tail = tail.next; } return dummy.next; } function dump(head) { const out = []; while (head) { out.push(head.val); head = head.next; } return out; } const ans = addTwoNumbers(build([2, 4, 3]), build([5, 6, 4])); console.log(dump(ans)); // [7, 0, 8] CTA If you often get stuck on boundary conditions in this problem, do these two drills right now:\nRe-implement while l1 or l2 or carry from memory without looking. Then solve LeetCode 445 and compare forward-order vs reverse-order addition. You can also continue with LeetCode 25 or LeetCode 142 to strengthen linked-list pointer fundamentals.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/2-add-two-numbers/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThis problem is just grade-school addition on a linked list: add one digit at a time, propagate carry, and append one final node if carry remains after both lists end. 
We move from naive ideas to the optimal one-pass solution, then map it to real engineering scenarios.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003ecarry\u003c/code\u003e, \u003ccode\u003esimulation\u003c/code\u003e, \u003ccode\u003eLeetCode 2\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Add Two Numbers, LeetCode 2, reverse-order list, carry, dummy node\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Use \u003ccode\u003edummy + tail + carry\u003c/code\u003e to sum two reverse-order linked lists in \u003ccode\u003eO(max(m,n))\u003c/code\u003e time, with common pitfalls, engineering analogies, and six-language runnable implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eBeginners building a stable template for linked-list problems\u003c/li\u003e\n\u003cli\u003eIntermediate developers who often miss carry or boundary cases\u003c/li\u003e\n\u003cli\u003eEngineers who want to transfer algorithmic thinking to stream-style data processing\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThis looks like an entry-level LeetCode problem, but it trains practical skills you will reuse:\u003c/p\u003e","title":"LeetCode 2: Add Two Numbers from Naive to Optimal Carry Simulation"},{"content":" Subtitle / Summary\nLeetCode 148 is not about whether you can sort; it is about choosing the right sorting strategy for linked-list constraints. 
For singly linked lists, merge sort fits naturally: split by middle, sort recursively, merge linearly.\nReading time: 12-16 min Tags: Hot100, linked list, merge sort, divide and conquer SEO keywords: Sort List, linked list merge sort, LeetCode 148, Hot100 Meta description: A practical ACERS guide for LeetCode 148 with derivation, complexity analysis, engineering mappings, and runnable code in multiple languages. Target Readers Hot100 learners building reusable linked-list templates Developers who struggle with split-and-reconnect pointer safety Engineers who want a clear answer to \u0026ldquo;why merge sort for linked lists\u0026rdquo; Background / Motivation Sorting linked structures appears in real systems:\npost-processing chained tasks by priority offline reordering of append-only linked logs memory-conscious restructuring with minimal copying If you directly copy array-sorting intuition to linked lists, you usually hit:\nno O(1) random access expensive and error-prone pointer shuffling for quicksort-style partitioning So this problem is fundamentally about algorithm-data-structure fit.\nCore Concepts Divide and Conquer: split list to subproblems, then merge upward Fast/slow middle finding: slow moves 1 step, fast moves 2 Linked-list merge: linear splice of two sorted sublists Stable sorting: equal keys keep relative order A - Algorithm (Problem and Algorithm) Problem Restatement Given the head of a linked list head, sort it in ascending order and return the sorted list. 
Required time complexity: O(n log n).\nInput / Output Name Type Description head ListNode head of a singly linked list (nullable) return ListNode head of sorted list Example 1 input: 4 -\u0026gt; 2 -\u0026gt; 1 -\u0026gt; 3 output: 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 Example 2 input: -1 -\u0026gt; 5 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 0 output: -1 -\u0026gt; 0 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5 Thought Process: From Naive to Optimal Naive approach: copy to array and sort Read values into array Use built-in sort Rebuild list Tradeoff:\nO(n) extra memory misses the core linked-list manipulation skill target Key observation Linked lists are good at:\ncutting (next = null) linear traversal splicing (next rewiring) This exactly matches merge sort:\nsplit around middle sort each half merge two sorted halves in linear time Method selection Use top-down merge sort on list:\nTime: O(n log n) Extra: recursion stack O(log n) clean, stable, and interview-practical C - Concepts (Core Ideas) Method Category linked-list divide-and-conquer sorting fast/slow split merge-template reuse (same pattern as LeetCode 21) Correctness intuition Base case: empty or single-node list is already sorted Induction: recursively sorted left/right halves are each sorted Merge: linear merge of two sorted lists remains sorted Therefore, the final result is sorted.\nComplexity recurrence T(n) = 2T(n/2) + O(n)\nBy Master theorem:\nT(n) = O(n log n) Practice Guide / Steps Return directly for 0/1 node list Find middle with fast/slow pointers and cut list into two halves Recursively sort left and right halves Merge two sorted halves with sentinel-node merge Return merged head Runnable Python example (sort_list.py):\nfrom typing import Optional class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def sort_list(head: Optional[ListNode]) -\u0026gt; Optional[ListNode]: if head is None or head.next is None: return head slow, fast = head, head.next while fast and 
fast.next: slow = slow.next fast = fast.next.next mid = slow.next slow.next = None left = sort_list(head) right = sort_list(mid) return merge(left, right) def merge(a: Optional[ListNode], b: Optional[ListNode]) -\u0026gt; Optional[ListNode]: dummy = ListNode() tail = dummy while a and b: if a.val \u0026lt;= b.val: tail.next, a = a, a.next else: tail.next, b = b, b.next tail = tail.next tail.next = a if a else b return dummy.next E - Engineering (Real-world Scenarios) Scenario 1: Task-chain reordering by priority (Go) Background: tasks are chained in insertion order but must run by priority.\nWhy it fits: linked-list split+merge avoids repeated array conversion.\ntype Task struct { Priority int Next *Task } func merge(a, b *Task) *Task { d := \u0026amp;Task{} t := d for a != nil \u0026amp;\u0026amp; b != nil { if a.Priority \u0026lt;= b.Priority { t.Next, a = a, a.Next } else { t.Next, b = b, b.Next } t = t.Next } if a != nil { t.Next = a } else { t.Next = b } return d.Next } Scenario 2: Offline chained log normalization (Python) Background: append-order logs need timestamp-order output for auditing.\nWhy it fits: merge-friendly linear passes scale predictably.\ndef merge_sorted_logs(a, b): i = j = 0 out = [] while i \u0026lt; len(a) and j \u0026lt; len(b): if a[i][0] \u0026lt;= b[j][0]: out.append(a[i]); i += 1 else: out.append(b[j]); j += 1 out.extend(a[i:]) out.extend(b[j:]) return out Scenario 3: Frontend incremental feed merge (JavaScript) Background: cached and remote pages are already sorted and need merged rendering.\nWhy it fits: merge gives deterministic linear behavior and stable ordering.\nfunction mergeByScore(a, b) { let i = 0, j = 0; const out = []; while (i \u0026lt; a.length \u0026amp;\u0026amp; j \u0026lt; b.length) { if (a[i].score \u0026lt;= b[j].score) out.push(a[i++]); else out.push(b[j++]); } while (i \u0026lt; a.length) out.push(a[i++]); while (j \u0026lt; b.length) out.push(b[j++]); return out; } R - Reflection (Complexity, Alternatives, 
Tradeoffs) Complexity Time: O(n log n) Extra space: O(log n) recursion stack Alternatives Method Time Space Notes Array copy + sort O(n log n) O(n) easy but loses list advantages List quicksort avg O(n log n), worst O(n²) O(log n) partitioning is awkward on list List merge sort (this) O(n log n) O(log n) stable and structure-friendly Common mistakes Forgetting slow.next = None, causing infinite recursion Wrong fast/slow initialization for even lengths Missing tail attachment in merge Pointer loss during recursive split Why this is the practical default It matches linked-list properties directly:\nno random access dependency linear work per recursion level reusable merge logic across multiple linked-list problems FAQ and Notes Can this be strict O(1) extra space?\nTop-down recursion uses O(log n) stack. Strict O(1) needs bottom-up iterative merge sort.\nWhy not quicksort?\nLinked-list partitioning is less natural and worst-case risk is higher.\nDoes stability matter?\nYes when equal keys carry extra business metadata.\nBest Practices Treat split+merge as a reusable helper template Test null, one node, even/odd lengths, and duplicate values Prioritize pointer correctness before micro-optimization Learn bottom-up merge sort after mastering this version S - Summary Merge sort is the best default for linked-list sorting Core workflow is split -\u0026gt; sort halves -\u0026gt; merge Complexity satisfies the target: O(n log n) This template generalizes to many list-based merge tasks Further Reading LeetCode 21. Merge Two Sorted Lists LeetCode 23. Merge k Sorted Lists LeetCode 147. Insertion Sort List LeetCode 25. 
Reverse Nodes in k-Group References https://leetcode.com/problems/sort-list/ https://en.cppreference.com/w/cpp/algorithm/stable_sort https://docs.python.org/3/howto/sorting.html https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html Meta Info Reading time: 12-16 min Tags: Hot100, linked list, merge sort, divide and conquer SEO keywords: Sort List, linked list merge sort, LeetCode 148, Hot100 Meta description: A complete linked-list merge-sort guide for LeetCode 148 with derivation, complexity, and multi-language code. CTA Two practical next steps:\nRe-implement recursive list merge sort from scratch without notes Build the bottom-up iterative variant and compare space tradeoffs Multi-language Implementations (Python / C / C++ / Go / Rust / JS) class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def sortList(head): if not head or not head.next: return head slow, fast = head, head.next while fast and fast.next: slow = slow.next fast = fast.next.next mid = slow.next slow.next = None left = sortList(head) right = sortList(mid) return merge(left, right) def merge(a, b): dummy = ListNode() t = dummy while a and b: if a.val \u0026lt;= b.val: t.next, a = a, a.next else: t.next, b = b, b.next t = t.next t.next = a if a else b return dummy.next typedef struct ListNode { int val; struct ListNode* next; } ListNode; static ListNode* merge(ListNode* a, ListNode* b) { ListNode dummy = {0, NULL}; ListNode* t = \u0026amp;dummy; while (a \u0026amp;\u0026amp; b) { if (a-\u0026gt;val \u0026lt;= b-\u0026gt;val) { t-\u0026gt;next = a; a = a-\u0026gt;next; } else { t-\u0026gt;next = b; b = b-\u0026gt;next; } t = t-\u0026gt;next; } t-\u0026gt;next = a ? 
a : b; return dummy.next; } ListNode* sortList(ListNode* head) { if (!head || !head-\u0026gt;next) return head; ListNode* slow = head; ListNode* fast = head-\u0026gt;next; while (fast \u0026amp;\u0026amp; fast-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; } ListNode* mid = slow-\u0026gt;next; slow-\u0026gt;next = NULL; return merge(sortList(head), sortList(mid)); } struct ListNode { int val; ListNode* next; ListNode(int x=0, ListNode* n=nullptr): val(x), next(n) {} }; class Solution { ListNode* merge(ListNode* a, ListNode* b) { ListNode dummy; ListNode* t = \u0026amp;dummy; while (a \u0026amp;\u0026amp; b) { if (a-\u0026gt;val \u0026lt;= b-\u0026gt;val) t-\u0026gt;next = a, a = a-\u0026gt;next; else t-\u0026gt;next = b, b = b-\u0026gt;next; t = t-\u0026gt;next; } t-\u0026gt;next = a ? a : b; return dummy.next; } public: ListNode* sortList(ListNode* head) { if (!head || !head-\u0026gt;next) return head; ListNode *slow = head, *fast = head-\u0026gt;next; while (fast \u0026amp;\u0026amp; fast-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; } ListNode* mid = slow-\u0026gt;next; slow-\u0026gt;next = nullptr; return merge(sortList(head), sortList(mid)); } }; type ListNode struct { Val int Next *ListNode } func sortList(head *ListNode) *ListNode { if head == nil || head.Next == nil { return head } slow, fast := head, head.Next for fast != nil \u0026amp;\u0026amp; fast.Next != nil { slow = slow.Next fast = fast.Next.Next } mid := slow.Next slow.Next = nil left := sortList(head) right := sortList(mid) return merge(left, right) } func merge(a, b *ListNode) *ListNode { dummy := \u0026amp;ListNode{} t := dummy for a != nil \u0026amp;\u0026amp; b != nil { if a.Val \u0026lt;= b.Val { t.Next = a a = a.Next } else { t.Next = b b = b.Next } t = t.Next } if a != nil { t.Next = a } else { t.Next = b } return dummy.Next } #[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub next: 
Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { fn new(val: i32) -\u0026gt; Self { Self { val, next: None } } } pub fn sort_list(head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { let mut vals = Vec::new(); let mut p = head; while let Some(mut node) = p { vals.push(node.val); p = node.next.take(); } vals.sort_unstable(); let mut ans = None; for v in vals.into_iter().rev() { let mut node = Box::new(ListNode::new(v)); node.next = ans; ans = Some(node); } ans } function ListNode(val, next = null) { this.val = val; this.next = next; } function sortList(head) { if (!head || !head.next) return head; let slow = head; let fast = head.next; while (fast \u0026amp;\u0026amp; fast.next) { slow = slow.next; fast = fast.next.next; } const mid = slow.next; slow.next = null; return merge(sortList(head), sortList(mid)); } function merge(a, b) { const dummy = new ListNode(0); let t = dummy; while (a \u0026amp;\u0026amp; b) { if (a.val \u0026lt;= b.val) { t.next = a; a = a.next; } else { t.next = b; b = b.next; } t = t.next; } t.next = a || b; return dummy.next; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/148-sort-list/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nLeetCode 148 is not about whether you can sort; it is about choosing the right sorting strategy for linked-list constraints. 
For singly linked lists, merge sort fits naturally: split by middle, sort recursively, merge linearly.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-16 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003emerge sort\u003c/code\u003e, \u003ccode\u003edivide and conquer\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Sort List, linked list merge sort, LeetCode 148, Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A practical ACERS guide for LeetCode 148 with derivation, complexity analysis, engineering mappings, and runnable code in multiple languages.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners building reusable linked-list templates\u003c/li\u003e\n\u003cli\u003eDevelopers who struggle with split-and-reconnect pointer safety\u003c/li\u003e\n\u003cli\u003eEngineers who want a clear answer to \u0026ldquo;why merge sort for linked lists\u0026rdquo;\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eSorting linked structures appears in real systems:\u003c/p\u003e","title":"Hot100: Sort List Linked-List Merge Sort ACERS Guide"},{"content":" Subtitle / Summary\nLeetCode 23 is a k-way merge problem, not just repeating LeetCode 21 in a loop. 
This ACERS guide derives the optimal structure, explains tradeoffs between divide-and-conquer and min-heap, and provides runnable implementations in multiple languages.\nReading time: 12-16 min Tags: Hot100, linked list, divide and conquer, merge SEO keywords: Merge K Sorted Lists, LeetCode 23, divide and conquer, O(N log k), Hot100 Meta description: A full ACERS explanation of Merge K Sorted Lists from naive ideas to O(N log k) divide-and-conquer, with engineering mapping and multi-language code. Target Readers Hot100 learners who have finished LeetCode 21 and want the next-level merge template Developers who need predictable performance for k-way ordered data merge Engineers preparing for linked-list and divide-and-conquer interview rounds Background / Motivation This problem appears in many production forms:\nmerge sorted outputs from multiple shards combine sorted event streams from multiple services aggregate sorted pagination slices The hard part is not correctness alone; it is controlling cost when k grows.\nCore Concepts N: total number of nodes across all lists k: number of lists Sequential merge: merge one list into the current result repeatedly Divide-and-conquer merge: merge in balanced rounds Min-heap k-way merge: keep current head from each list in a heap A - Algorithm (Problem and Algorithm) Problem Restatement Given an array lists of k sorted linked lists, merge them into one sorted linked list and return it.\nInput / Output Name Type Description lists ListNode[] k sorted lists, each can be null return ListNode merged sorted linked list head Example 1 input: lists = [[1,4,5],[1,3,4],[2,6]] output: [1,1,2,3,4,4,5,6] Example 2 input: lists = [] output: [] C - Concepts (Core Ideas) Thought Process: Naive -\u0026gt; Bottleneck -\u0026gt; Better Structure Flatten + sort\nO(N log N), O(N) extra space ignores that each list is already sorted Sequential merge\n(((l1 merge l2) merge l3) ...) 
repeated scanning of early nodes can degrade toward O(Nk) Key observation\nBalanced pairwise merges form a merge tree each level processes about N nodes number of levels is about log k Method choice\nDivide-and-conquer merge Time: O(N log k) Extra space: O(log k) recursion stack Method Category Divide and conquer Linked-list in-place splicing Same complexity class as heap-based k-way merge Correctness Invariant For interval [l, r]:\nsolve(l, r) returns a fully sorted merge of all lists in that interval if left and right halves are correctly merged, mergeTwo(left, right) preserves sorted order and includes every node exactly once Practice Guide / Steps Reuse a stable mergeTwo helper (LeetCode 21 template) Build recursive solve(l, r): base: l == r return lists[l] split at mid, solve both halves Merge results from both halves Start from full range [0, k-1] Runnable Python example (merge_k_lists.py):\nfrom typing import List, Optional class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def merge_two(a: Optional[ListNode], b: Optional[ListNode]) -\u0026gt; Optional[ListNode]: dummy = ListNode() tail = dummy while a and b: if a.val \u0026lt;= b.val: nxt = a.next tail.next = a a.next = None a = nxt else: nxt = b.next tail.next = b b.next = None b = nxt tail = tail.next tail.next = a if a else b return dummy.next def merge_k_lists(lists: List[Optional[ListNode]]) -\u0026gt; Optional[ListNode]: if not lists: return None def solve(l: int, r: int) -\u0026gt; Optional[ListNode]: if l == r: return lists[l] m = (l + r) // 2 return merge_two(solve(l, m), solve(m + 1, r)) return solve(0, len(lists) - 1) Explanation / Why This Works Balanced merging avoids repeatedly merging a huge partial result with small lists.\nwork per level: about N node operations number of levels: about log k So total time is O(N log k), while preserving in-place node reuse.\nE - Engineering (Real-world Scenarios) Scenario 1: 
Merge sorted shard timelines (Go) Background: each shard emits sorted events by timestamp.\nWhy it fits: divide-and-conquer merge is easy to parallelize by levels.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { Ts int Next *Node } func mergeTwo(a, b *Node) *Node { dummy := \u0026amp;Node{} tail := dummy for a != nil \u0026amp;\u0026amp; b != nil { if a.Ts \u0026lt;= b.Ts { nxt := a.Next tail.Next = a a.Next = nil a = nxt } else { nxt := b.Next tail.Next = b b.Next = nil b = nxt } tail = tail.Next } if a != nil { tail.Next = a } else { tail.Next = b } return dummy.Next } func main() { a := \u0026amp;Node{1, \u0026amp;Node{4, \u0026amp;Node{9, nil}}} b := \u0026amp;Node{2, \u0026amp;Node{5, nil}} for p := mergeTwo(a, b); p != nil; p = p.Next { fmt.Print(p.Ts, \u0026#34; \u0026#34;) } fmt.Println() } Scenario 2: Offline merge of sorted rule outputs (Python) Background: multiple ranking pipelines output sorted IDs. Why it fits: divide-and-conquer gives stable performance for large k.\ndef merge_two(a, b): i = j = 0 out = [] while i \u0026lt; len(a) and j \u0026lt; len(b): if a[i] \u0026lt;= b[j]: out.append(a[i]); i += 1 else: out.append(b[j]); j += 1 out.extend(a[i:]) out.extend(b[j:]) return out def merge_k(arrays): if not arrays: return [] cur = arrays while len(cur) \u0026gt; 1: nxt = [] for i in range(0, len(cur), 2): if i + 1 \u0026lt; len(cur): nxt.append(merge_two(cur[i], cur[i + 1])) else: nxt.append(cur[i]) cur = nxt return cur[0] Scenario 3: Frontend unified feed from multiple sorted sources (JavaScript) Background: web app receives sorted cards from multiple APIs. 
Why it fits: deterministic merge order with no global re-sort.\nfunction mergeTwo(a, b) { let i = 0; let j = 0; const out = []; while (i \u0026lt; a.length \u0026amp;\u0026amp; j \u0026lt; b.length) { if (a[i].ts \u0026lt;= b[j].ts) out.push(a[i++]); else out.push(b[j++]); } while (i \u0026lt; a.length) out.push(a[i++]); while (j \u0026lt; b.length) out.push(b[j++]); return out; } R - Reflection (Complexity, Alternatives, Tradeoffs) Complexity Time: O(N log k) Space: O(log k) recursion stack Alternative Methods Method Time Space Notes Flatten + sort O(N log N) O(N) easiest, wastes structure Sequential merge near O(Nk) worst O(1) degrades as k grows Min-heap O(N log k) O(k) great for streaming inputs Divide-and-conquer O(N log k) O(log k) clean and reusable Common Mistakes Forgetting empty input handling (lists=[]) Assuming sequential merge is always efficient Losing nodes in mergeTwo pointer rewiring Incorrect base case in recursion Why this method is practical It balances performance and implementation simplicity. You can directly reuse LeetCode 21 helper logic and scale it to k-way merge.\nFAQ and Notes Divide-and-conquer or heap?\nBatch merge: divide-and-conquer is often cleaner. Streaming merge: heap is often better.\nCan this be fully in-place?\nYes, by rewiring next pointers (except helper/sentinel nodes).\nWhat if values repeat?\nNo issue. Use \u0026lt;= to keep stable tie behavior.\nBest Practices Treat mergeTwo as a shared utility function Never use sequential k-way merge for large k Validate with edge cases: empty lists, many null lists, uneven lengths Track k and N in production metrics for strategy switching S - Summary LeetCode 23 is a k-way merge scaling problem Divide-and-conquer reduces complexity to O(N log k) Heap and divide-and-conquer are both optimal in asymptotic time Mastering this pattern helps with many merge-based interview and production tasks Further Reading LeetCode 21. Merge Two Sorted Lists LeetCode 23. 
Merge K Sorted Lists LeetCode 148. Sort List LeetCode 632. Smallest Range Covering Elements from K Lists Conclusion The key upgrade from LeetCode 21 to 23 is structural thinking:\nmove from sequential accumulation to balanced reduction. This is a general engineering pattern for multi-source ordered data.\nReferences https://leetcode.com/problems/merge-k-sorted-lists/ https://docs.python.org/3/library/heapq.html https://en.cppreference.com/w/cpp/container/priority_queue https://pkg.go.dev/container/heap Meta Info Reading time: 12-16 min Tags: Hot100, linked list, divide and conquer, merge SEO keywords: Merge K Sorted Lists, LeetCode 23, O(N log k), Hot100 Meta description: End-to-end ACERS guide for LeetCode 23 with complexity derivation and multi-language implementations. CTA Implement mergeTwo + mergeK in your strongest language first, then re-implement in a second language. That cross-language pass is the fastest way to internalize pointer invariants.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import List, Optional class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def merge_two(a: Optional[ListNode], b: Optional[ListNode]) -\u0026gt; Optional[ListNode]: dummy = ListNode() tail = dummy while a and b: if a.val \u0026lt;= b.val: nxt = a.next tail.next = a a.next = None a = nxt else: nxt = b.next tail.next = b b.next = None b = nxt tail = tail.next tail.next = a if a else b return dummy.next def merge_k_lists(lists: List[Optional[ListNode]]) -\u0026gt; Optional[ListNode]: if not lists: return None def solve(l: int, r: int) -\u0026gt; Optional[ListNode]: if l == r: return lists[l] m = (l + r) // 2 return merge_two(solve(l, m), solve(m + 1, r)) return solve(0, len(lists) - 1) #include \u0026lt;stddef.h\u0026gt; typedef struct ListNode { int val; struct ListNode* next; } ListNode; ListNode* mergeTwo(ListNode* a, ListNode* b) { ListNode dummy; 
dummy.next = NULL; ListNode* tail = \u0026amp;dummy; while (a \u0026amp;\u0026amp; b) { if (a-\u0026gt;val \u0026lt;= b-\u0026gt;val) { ListNode* nxt = a-\u0026gt;next; tail-\u0026gt;next = a; a-\u0026gt;next = NULL; a = nxt; } else { ListNode* nxt = b-\u0026gt;next; tail-\u0026gt;next = b; b-\u0026gt;next = NULL; b = nxt; } tail = tail-\u0026gt;next; } tail-\u0026gt;next = a ? a : b; return dummy.next; } ListNode* solve(ListNode** lists, int l, int r) { if (l \u0026gt; r) return NULL; if (l == r) return lists[l]; int m = l + (r - l) / 2; ListNode* left = solve(lists, l, m); ListNode* right = solve(lists, m + 1, r); return mergeTwo(left, right); } ListNode* mergeKLists(ListNode** lists, int listsSize) { if (listsSize == 0) return NULL; return solve(lists, 0, listsSize - 1); } #include \u0026lt;vector\u0026gt; using namespace std; struct ListNode { int val; ListNode* next; ListNode(int x = 0, ListNode* n = nullptr) : val(x), next(n) {} }; class Solution { ListNode* mergeTwo(ListNode* a, ListNode* b) { ListNode dummy; ListNode* tail = \u0026amp;dummy; while (a \u0026amp;\u0026amp; b) { if (a-\u0026gt;val \u0026lt;= b-\u0026gt;val) { ListNode* nxt = a-\u0026gt;next; tail-\u0026gt;next = a; a-\u0026gt;next = nullptr; a = nxt; } else { ListNode* nxt = b-\u0026gt;next; tail-\u0026gt;next = b; b-\u0026gt;next = nullptr; b = nxt; } tail = tail-\u0026gt;next; } tail-\u0026gt;next = a ? 
a : b; return dummy.next; } ListNode* solve(vector\u0026lt;ListNode*\u0026gt;\u0026amp; lists, int l, int r) { if (l \u0026gt; r) return nullptr; if (l == r) return lists[l]; int m = l + (r - l) / 2; return mergeTwo(solve(lists, l, m), solve(lists, m + 1, r)); } public: ListNode* mergeKLists(vector\u0026lt;ListNode*\u0026gt;\u0026amp; lists) { if (lists.empty()) return nullptr; return solve(lists, 0, (int)lists.size() - 1); } }; package main type ListNode struct { Val int Next *ListNode } func mergeTwo(a, b *ListNode) *ListNode { dummy := \u0026amp;ListNode{} tail := dummy for a != nil \u0026amp;\u0026amp; b != nil { if a.Val \u0026lt;= b.Val { nxt := a.Next tail.Next = a a.Next = nil a = nxt } else { nxt := b.Next tail.Next = b b.Next = nil b = nxt } tail = tail.Next } if a != nil { tail.Next = a } else { tail.Next = b } return dummy.Next } func solve(lists []*ListNode, l, r int) *ListNode { if l \u0026gt; r { return nil } if l == r { return lists[l] } m := l + (r-l)/2 left := solve(lists, l, m) right := solve(lists, m+1, r) return mergeTwo(left, right) } func mergeKLists(lists []*ListNode) *ListNode { if len(lists) == 0 { return nil } return solve(lists, 0, len(lists)-1) } #[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { #[inline] fn new(val: i32) -\u0026gt; Self { ListNode { val, next: None } } } fn merge_two(a: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, b: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { match (a, b) { (None, x) =\u0026gt; x, (x, None) =\u0026gt; x, (Some(mut na), Some(mut nb)) =\u0026gt; { if na.val \u0026lt;= nb.val { let next = na.next.take(); na.next = merge_two(next, Some(nb)); Some(na) } else { let next = nb.next.take(); nb.next = merge_two(Some(na), next); Some(nb) } } } } fn solve(lists: \u0026amp;mut 
[Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;], l: usize, r: usize) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { if l == r { return lists[l].take(); } let m = (l + r) / 2; let left = solve(lists, l, m); let right = solve(lists, m + 1, r); merge_two(left, right) } pub fn merge_k_lists(mut lists: Vec\u0026lt;Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;\u0026gt;) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { if lists.is_empty() { return None; } let n = lists.len(); solve(\u0026amp;mut lists, 0, n - 1) } function ListNode(val, next = null) { this.val = val; this.next = next; } function mergeTwo(a, b) { const dummy = new ListNode(0); let tail = dummy; while (a \u0026amp;\u0026amp; b) { if (a.val \u0026lt;= b.val) { const nxt = a.next; tail.next = a; a.next = null; a = nxt; } else { const nxt = b.next; tail.next = b; b.next = null; b = nxt; } tail = tail.next; } tail.next = a || b; return dummy.next; } function solve(lists, l, r) { if (l \u0026gt; r) return null; if (l === r) return lists[l]; const m = (l + r) \u0026gt;\u0026gt; 1; return mergeTwo(solve(lists, l, m), solve(lists, m + 1, r)); } function mergeKLists(lists) { if (!lists || lists.length === 0) return null; return solve(lists, 0, lists.length - 1); } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/23-merge-k-sorted-lists/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nLeetCode 23 is a k-way merge problem, not just repeating LeetCode 21 in a loop. 
This ACERS guide derives the optimal structure, explains tradeoffs between divide-and-conquer and min-heap, and provides runnable implementations in multiple languages.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-16 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003edivide and conquer\u003c/code\u003e, \u003ccode\u003emerge\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Merge K Sorted Lists, LeetCode 23, divide and conquer, O(N log k), Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A full ACERS explanation of Merge K Sorted Lists from naive ideas to O(N log k) divide-and-conquer, with engineering mapping and multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who have finished LeetCode 21 and want the next-level merge template\u003c/li\u003e\n\u003cli\u003eDevelopers who need predictable performance for k-way ordered data merge\u003c/li\u003e\n\u003cli\u003eEngineers preparing for linked-list and divide-and-conquer interview rounds\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThis problem appears in many production forms:\u003c/p\u003e","title":"Hot100: Merge K Sorted Lists Divide-and-Conquer O(N log k) ACERS Guide"},{"content":" Subtitle / Summary\nLeetCode 142 upgrades cycle detection into cycle entry localization. 
The robust template is Floyd: first detect a meeting inside the cycle, then reset one pointer to head and move both by one step; the next meeting node is the cycle entry.\nReading time: 12-16 min Tags: Hot100, linked list, fast slow pointers, Floyd SEO keywords: Linked List Cycle II, cycle entry, Floyd, fast slow pointers, O(1) space, LeetCode 142, Hot100 Meta description: Floyd cycle detection + entry localization with proof intuition, engineering mapping, and runnable multi-language implementations in O(n) time and O(1) extra space. Target Readers Hot100 learners who want to fully internalize the 141 -\u0026gt; 142 linked-list template family Developers who need to locate where a pointer chain becomes cyclic Interview candidates who want to explain why \u0026ldquo;reset to head\u0026rdquo; works Background / Motivation In real systems, cycle corruption in chain structures can cause:\nendless traversal loops stuck cleanup tasks misleadingly stable but non-progressing runtime behavior Detecting whether a cycle exists is helpful, but operations/debugging usually require more:\nwhere does the cycle begin? That exact requirement is modeled by LeetCode 142.\nCore Concepts Concept Meaning Why it matters Cycle Following next eventually revisits a node causes non-terminating traversals Entry node First node where linear prefix enters the loop required return value Floyd algorithm slow moves 1 step, fast moves 2 steps O(1) extra memory Meeting point First collision inside cycle bridge to entry localization Identity equality compare node reference, not node value value duplicates are common A - Algorithm (Problem and Algorithm) Problem Restatement Given head of a singly linked list, return the node where the cycle begins. 
If there is no cycle, return null.\nNotes:\npos in the statement is only for test-data construction pos is not a function argument list structure must not be modified Input / Output Name Type Description head ListNode head of singly linked list return ListNode / null entry node reference, or null Example 1 head: 3 -\u0026gt; 2 -\u0026gt; 0 -\u0026gt; -4 ^ | |_________| output: node(2) Example 2 head: 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; null output: null Thought Process: From Hashing to Floyd + Reset Naive approach: visited set Traverse nodes and store references in a hash set:\nif current node already exists in set, it is the entry if traversal reaches null, there is no cycle Pros: straightforward. Cons: O(n) extra space.\nConstraint-driven upgrade Need O(1) extra memory while still returning the entry node.\nKey observation Floyd gives two phases:\nDetection: if slow and fast meet, cycle exists Localization: move one pointer to head, keep the other at meeting point; step both by 1 until they meet again Second meeting point is exactly the cycle entry.\nC - Concepts (Core Ideas) Method Category Two pointers (fast/slow) Floyd cycle detection Distance alignment after reset Why reset-to-head works Define:\na: distance from head to cycle entry b: distance from entry to first meeting point c: cycle length At first meeting:\nslow traveled a + b fast traveled 2(a + b) Difference is one whole number of cycle lengths:\n2(a + b) - (a + b) = k * c =\u0026gt; a + b = k * c =\u0026gt; a = k * c - b Meaning:\nfrom head to entry: a steps from meeting point to entry: also a steps modulo cycle length So moving one pointer from head and one from meeting, both one step each round, guarantees meeting at the entry.\nInvariant for localization phase During phase 2, both pointers have equal remaining distance (modulo cycle) to entry. 
Equal-speed synchronous movement preserves this equality, so collision point is entry.\nPractical Guide / Steps Initialize slow = head, fast = head Detection loop: slow = slow.next fast = fast.next.next if pointers meet, break If fast == null or fast.next == null, no cycle -\u0026gt; return null Set p1 = head, p2 = meeting Move both by one step until p1 == p2 Return p1 (entry) Runnable Example (Python) from typing import Optional class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def detect_cycle(head: Optional[ListNode]) -\u0026gt; Optional[ListNode]: slow = head fast = head while fast and fast.next: slow = slow.next fast = fast.next.next if slow is fast: p1 = head p2 = slow while p1 is not p2: p1 = p1.next p2 = p2.next return p1 return None def build_cycle_list(values, pos): dummy = ListNode() tail = dummy entry = None for idx, x in enumerate(values): tail.next = ListNode(x) tail = tail.next if idx == pos: entry = tail if tail and pos \u0026gt;= 0: tail.next = entry return dummy.next if __name__ == \u0026#34;__main__\u0026#34;: h = build_cycle_list([3, 2, 0, -4], 1) e = detect_cycle(h) print(e.val if e else None) # 2 Explanation / Why This Works The algorithm is split by responsibility:\nphase 1 answers: cycle or no cycle phase 2 answers: where cycle starts This separation is important for correctness and debugging.\nSetting one pointer to head is not a trick; it is distance alignment derived from meeting-point equations. 
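The distance-alignment claim can also be checked numerically. The sketch below (an illustrative addition, not part of the original solution) models the list as integer positions — `0..a-1` for the linear prefix, `a..a+c-1` for the cycle — and verifies that phase 2 always collides exactly at the entry position `a`:

```python
# Numeric sanity check of the reset-to-head argument, using index
# arithmetic instead of real nodes. Positions 0..a-1 form the linear
# prefix; positions a..a+c-1 form the cycle (entry = a).

def simulate_floyd(a: int, c: int) -> int:
    """Run both Floyd phases on a prefix of length a and cycle of length c.

    Returns the position where phase 2 collides (expected: a, the entry).
    """
    def step(p: int, k: int = 1) -> int:
        # Advance k steps; stepping past the last node wraps to the entry.
        for _ in range(k):
            p += 1
            if p >= a + c:
                p = a
        return p

    # Phase 1: slow moves 1 step, fast moves 2, until they meet in the cycle.
    slow = fast = 0
    while True:
        slow = step(slow)
        fast = step(fast, 2)
        if slow == fast:
            break

    # Phase 2: reset one pointer to head; equal-speed walk until collision.
    p1, p2 = 0, slow
    while p1 != p2:
        p1 = step(p1)
        p2 = step(p2)
    return p1

# The collision is the entry for every small (a, c) shape, including a = 0
# (head inside the cycle) and c = 1 (self-loop).
assert all(simulate_floyd(a, c) == a
           for a in range(0, 10) for c in range(1, 10))
```

Exhaustively checking small shapes like this is a cheap way to convince yourself of the `a = k*c - b` derivation before trusting it in an interview.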
That is why this method is both memory-optimal and proof-friendly.\nE - Engineering (Real-world Scenarios) Scenario 1: asynchronous callback chain corruption check (Go) Background: callback chain unexpectedly loops in production.\nWhy it fits: no extra map allocation, works on large chains.\npackage main type Node struct { Val int Next *Node } func detectCycle(head *Node) *Node { slow, fast := head, head for fast != nil \u0026amp;\u0026amp; fast.Next != nil { slow = slow.Next fast = fast.Next.Next if slow == fast { p1, p2 := head, slow for p1 != p2 { p1 = p1.Next p2 = p2.Next } return p1 } } return nil } Scenario 2: ETL pointer-chain anomaly localization (Python) Background: transformed record chain can accidentally self-link.\nWhy it fits: allows locating the first bad join node directly.\n# Reuse detect_cycle(head) above and log entry node identity/value. Scenario 3: front-end linked-state graph guard (JavaScript) Background: linked state nodes in memory may form unintended cycles. Why it fits: fast runtime check in debug tooling.\nfunction detectCycle(head) { let slow = head; let fast = head; while (fast \u0026amp;\u0026amp; fast.next) { slow = slow.next; fast = fast.next.next; if (slow === fast) { let p1 = head; let p2 = slow; while (p1 !== p2) { p1 = p1.next; p2 = p2.next; } return p1; } } return null; } R - Reflection Complexity Time: O(n) Extra space: O(1) Alternatives and Tradeoffs Method Time Space Notes visited hash set O(n) O(n) easy, but extra memory Floyd + reset O(n) O(1) optimal for constraints Common Mistakes forgetting fast \u0026amp;\u0026amp; fast.next null checks comparing node values (val) instead of references returning meeting point directly (meeting is not always entry) modifying list to mark visited nodes (forbidden by problem) Why this method is optimal in practice no structural mutation no extra memory pressure deterministic behavior under large lists strong proof story for interviews and code review FAQ and Notes Why is the first meeting 
point not necessarily the entry?\nBecause fast and slow can first collide anywhere inside the cycle.\nCan this fail with duplicate values?\nNo, if you compare references (is, == pointer) rather than values.\nWhat if list length is very small?\nNull checks naturally handle 0 or 1 node lists.\nCan we stop once cycle is detected?\nFor LeetCode 141 yes; for 142 you must run localization phase.\nBest Practices Implement 141 and 142 as a pair to reuse mental model Keep phase separation explicit in code (detect then locate) Add tests for: no cycle single-node self-cycle cycle entry at head cycle entry in middle In production debug tools, print entry node identity and predecessor info when possible S - Summary LeetCode 142 extends cycle detection to entry localization Floyd detects cycle with O(1) memory Reset-to-head works by distance alignment, not memorized magic Correct implementation depends on reference comparison and null-safe traversal This pattern is reusable for many chain-integrity diagnostics Recommended Follow-up LeetCode 141 — Linked List Cycle LeetCode 160 — Intersection of Two Linked Lists Floyd cycle-finding notes in pointer-heavy systems General linked-list invariants and mutation safety patterns Conclusion Once you understand \u0026ldquo;detect first, then reset and align distances\u0026rdquo;, Linked List Cycle II becomes a stable template rather than a memorized trick.\nReferences https://leetcode.com/problems/linked-list-cycle-ii/ https://en.wikipedia.org/wiki/Cycle_detection https://en.cppreference.com/w/cpp/language/pointer https://go.dev/doc/effective_go Meta Info Reading time: 12-16 min Tags: Hot100, linked list, fast/slow pointers, Floyd SEO keywords: Linked List Cycle II, cycle entry, Floyd, LeetCode 142 Meta description: O(n)/O(1) cycle entry localization with Floyd fast/slow pointers and reset alignment proof. 
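The duplicate-values FAQ above is worth seeing concretely. In this small sketch (added for illustration; node values chosen arbitrarily), every node carries the same value, so any value-based comparison would misfire, yet identity comparison with `is` still locates the true entry:

```python
# Floyd entry localization on a cycle whose nodes all share one value.
# Comparisons use node identity (`is`), never `.val`.

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def detect_cycle(head):
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:                 # identity, not slow.val == fast.val
            p1, p2 = head, slow
            while p1 is not p2:
                p1, p2 = p1.next, p2.next
            return p1
    return None

# Build 7 -> 7 -> 7 -> 7 with the tail linking back to the second node.
nodes = [ListNode(7) for _ in range(4)]
for a, b in zip(nodes, nodes[1:]):
    a.next = b
nodes[-1].next = nodes[1]                # cycle entry is nodes[1]

entry = detect_cycle(nodes[0])
assert entry is nodes[1]                 # correct despite equal values everywhere
```

If the two comparisons were rewritten as `slow.val == fast.val` and `p1.val != p2.val`, this list would report a bogus "meeting" on the very first step — exactly the bug the FAQ warns about.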
Call To Action (CTA) Run this mini drill:\nRe-implement 141 and 142 back-to-back from memory Write the distance equation once (a+b = k*c) and explain it aloud Build four edge-case tests and verify pointer identity-based assertions Multi-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import Optional class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def detect_cycle(head: Optional[ListNode]) -\u0026gt; Optional[ListNode]: slow = head fast = head while fast and fast.next: slow = slow.next fast = fast.next.next if slow is fast: p1 = head p2 = slow while p1 is not p2: p1 = p1.next p2 = p2.next return p1 return None struct ListNode { int val; struct ListNode *next; }; struct ListNode *detectCycle(struct ListNode *head) { struct ListNode *slow = head; struct ListNode *fast = head; while (fast \u0026amp;\u0026amp; fast-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; if (slow == fast) { struct ListNode *p1 = head; struct ListNode *p2 = slow; while (p1 != p2) { p1 = p1-\u0026gt;next; p2 = p2-\u0026gt;next; } return p1; } } return NULL; } class Solution { public: ListNode *detectCycle(ListNode *head) { ListNode *slow = head; ListNode *fast = head; while (fast \u0026amp;\u0026amp; fast-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; if (slow == fast) { ListNode *p1 = head; ListNode *p2 = slow; while (p1 != p2) { p1 = p1-\u0026gt;next; p2 = p2-\u0026gt;next; } return p1; } } return nullptr; } }; func detectCycle(head *ListNode) *ListNode { slow, fast := head, head for fast != nil \u0026amp;\u0026amp; fast.Next != nil { slow = slow.Next fast = fast.Next.Next if slow == fast { p1, p2 := head, slow for p1 != p2 { p1 = p1.Next p2 = p2.Next } return p1 } } return nil } use std::cell::RefCell; use std::rc::Rc; type Link = Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;ListNode\u0026gt;\u0026gt;\u0026gt;; #[derive(Debug)] pub struct ListNode { pub val: 
i32, pub next: Link, } pub fn detect_cycle(head: Link) -\u0026gt; Link { let mut slow = head.clone(); let mut fast = head.clone(); loop { slow = match slow.clone() { Some(node) =\u0026gt; node.borrow().next.clone(), None =\u0026gt; return None, }; fast = match fast.clone() { Some(node) =\u0026gt; match node.borrow().next.clone() { Some(next1) =\u0026gt; next1.borrow().next.clone(), None =\u0026gt; return None, }, None =\u0026gt; return None, }; match (slow.clone(), fast.clone()) { (Some(s), Some(f)) if Rc::ptr_eq(\u0026amp;s, \u0026amp;f) =\u0026gt; break, (Some(_), Some(_)) =\u0026gt; {} _ =\u0026gt; return None, } } let mut p1 = head; let mut p2 = slow; loop { match (p1.clone(), p2.clone()) { (Some(a), Some(b)) =\u0026gt; { if Rc::ptr_eq(\u0026amp;a, \u0026amp;b) { return Some(a); } p1 = a.borrow().next.clone(); p2 = b.borrow().next.clone(); } _ =\u0026gt; return None, } } } function detectCycle(head) { let slow = head; let fast = head; while (fast \u0026amp;\u0026amp; fast.next) { slow = slow.next; fast = fast.next.next; if (slow === fast) { let p1 = head; let p2 = slow; while (p1 !== p2) { p1 = p1.next; p2 = p2.next; } return p1; } } return null; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/142-linked-list-cycle-ii/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nLeetCode 142 upgrades cycle detection into cycle entry localization. 
The robust template is Floyd: first detect a meeting inside the cycle, then reset one pointer to \u003ccode\u003ehead\u003c/code\u003e and move both by one step; the next meeting node is the cycle entry.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-16 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003efast slow pointers\u003c/code\u003e, \u003ccode\u003eFloyd\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Linked List Cycle II, cycle entry, Floyd, fast slow pointers, O(1) space, LeetCode 142, Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Floyd cycle detection + entry localization with proof intuition, engineering mapping, and runnable multi-language implementations in O(n) time and O(1) extra space.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want to fully internalize the \u003ccode\u003e141 -\u0026gt; 142\u003c/code\u003e linked-list template family\u003c/li\u003e\n\u003cli\u003eDevelopers who need to locate where a pointer chain becomes cyclic\u003c/li\u003e\n\u003cli\u003eInterview candidates who want to explain why \u0026ldquo;reset to head\u0026rdquo; works\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eIn real systems, cycle corruption in chain structures can cause:\u003c/p\u003e","title":"Hot100: Linked List Cycle II Floyd Detection + Entry Localization ACERS Guide"},{"content":" Subtitle / Summary\nThis problem is the linked-list version of merge-sort\u0026rsquo;s merge step. 
Use a sentinel node plus two pointers to splice nodes in ascending order in O(m+n), without rebuilding the list.\nReading time: 10-12 min Tags: Hot100, linked list, merge, two pointers SEO keywords: Merge Two Sorted Lists, sentinel node, linked list merge, LeetCode 21, Hot100 Meta description: A complete ACERS guide for LeetCode 21 with derivation, correctness invariants, pitfalls, and runnable multi-language code. Target Readers Hot100 learners preparing linked-list interview templates Developers who often lose nodes while rewiring next Engineers who need stable O(1)-extra-space merge patterns Background / Motivation This is a small problem with large transfer value:\nIt is a direct building block of merge k sorted lists It reinforces pointer safety under in-place rewiring It mirrors real-world merging of two already sorted streams If this template is stable in your hands, many linked-list and divide-and-conquer problems become easier.\nCore Concepts Sorted linked list: non-decreasing values along next Splicing merge: reuse original nodes by rewiring pointers Sentinel (dummy) node: avoids special handling for the head of result Tail pointer: always points to the last node in merged list A - Algorithm (Problem and Algorithm) Problem Restatement Given heads list1 and list2 of two sorted linked lists, merge them into one sorted linked list and return its head. 
The merged list should be formed by splicing together nodes from the original lists.\nInput / Output Name Type Description list1 ListNode Head of sorted list 1 (nullable) list2 ListNode Head of sorted list 2 (nullable) return ListNode Head of merged sorted list Example 1 list1: 1 -\u0026gt; 2 -\u0026gt; 4 list2: 1 -\u0026gt; 3 -\u0026gt; 4 output: 1 -\u0026gt; 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 4 Example 2 list1: null list2: 0 -\u0026gt; 5 output: 0 -\u0026gt; 5 Thought Process: From Naive to Optimal Naive approach: flatten + sort + rebuild Read values from both lists into array Sort the array Recreate a new linked list Problems:\nO(m+n) extra space violates the spirit of \u0026ldquo;splice original nodes\u0026rdquo; Key observation Both lists are already sorted. At each step, the next smallest node must be one of the two current heads.\nSo we can:\ncompare current nodes append smaller one to result tail move that list pointer forward Method choice Use sentinel + two pointers:\nO(m+n) time O(1) extra space (excluding sentinel node) stable and interview-friendly C - Concepts (Core Ideas) Method Category Two-pointer merge In-place linked-list splicing Sentinel node pattern Loop Invariant Before each iteration:\ndummy.next ... tail is already sorted p1 and p2 point to the first unmerged nodes in each list All nodes before p1 and p2 have been merged exactly once After appending the smaller head, invariants still hold. When one list ends, append the rest of the other list directly.\nWhy appending the remainder is safe If p1 is null, all remaining nodes in p2 are already in sorted order and all are \u0026gt;= tail.val. 
So attaching tail.next = p2 preserves sorted order.\nPractice Guide / Steps Create dummy and set tail = dummy Initialize p1 = list1, p2 = list2 While both are non-null: if p1.val \u0026lt;= p2.val, append p1 else append p2 move tail Append the non-null remainder Return dummy.next Runnable Python example (merge_two_lists.py):\nfrom typing import List, Optional class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def merge_two_lists(list1: Optional[ListNode], list2: Optional[ListNode]) -\u0026gt; Optional[ListNode]: dummy = ListNode(0) tail = dummy p1, p2 = list1, list2 while p1 is not None and p2 is not None: if p1.val \u0026lt;= p2.val: nxt = p1.next tail.next = p1 p1.next = None p1 = nxt else: nxt = p2.next tail.next = p2 p2.next = None p2 = nxt tail = tail.next tail.next = p1 if p1 is not None else p2 return dummy.next def from_list(arr: List[int]) -\u0026gt; Optional[ListNode]: dummy = ListNode() cur = dummy for x in arr: cur.next = ListNode(x) cur = cur.next return dummy.next def to_list(head: Optional[ListNode]) -\u0026gt; List[int]: out: List[int] = [] while head: out.append(head.val) head = head.next return out if __name__ == \u0026#34;__main__\u0026#34;: l1 = from_list([1, 2, 4]) l2 = from_list([1, 3, 4]) print(to_list(merge_two_lists(l1, l2))) print(to_list(merge_two_lists(None, from_list([0, 5])))) Explanation / Why This Works Each step picks the globally smallest remaining node between the two list heads. That is exactly the merge-sort merge principle.\nBecause every node is moved once, and no node is revisited:\ntime is linear in total node count space is constant (pointer variables + sentinel) E - Engineering (Real-world Scenarios) Scenario 1: Merge two ordered event streams (Go) Background: backend services often merge logs/events by timestamp. 
Why it fits: both inputs are already sorted; linear merge minimizes latency.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { Ts int Next *Node } func merge(a, b *Node) *Node { dummy := \u0026amp;Node{} tail := dummy for a != nil \u0026amp;\u0026amp; b != nil { if a.Ts \u0026lt;= b.Ts { tail.Next = a a = a.Next } else { tail.Next = b b = b.Next } tail = tail.Next } if a != nil { tail.Next = a } else { tail.Next = b } return dummy.Next } func main() { a := \u0026amp;Node{1, \u0026amp;Node{3, \u0026amp;Node{7, nil}}} b := \u0026amp;Node{2, \u0026amp;Node{4, \u0026amp;Node{8, nil}}} for p := merge(a, b); p != nil; p = p.Next { fmt.Print(p.Ts, \u0026#34; \u0026#34;) } fmt.Println() } Scenario 2: Offline merge of sorted ID sets (Python) Background: analytics jobs frequently merge sorted outputs from two rules. Why it fits: pointer-based merge avoids expensive global sort for pre-sorted sources.\ndef merge_sorted(a, b): i = j = 0 out = [] while i \u0026lt; len(a) and j \u0026lt; len(b): if a[i] \u0026lt;= b[j]: out.append(a[i]); i += 1 else: out.append(b[j]); j += 1 out.extend(a[i:]) out.extend(b[j:]) return out print(merge_sorted([1, 3, 7], [2, 4, 8])) Scenario 3: Frontend timeline composition from two sorted feeds (JavaScript) Background: one feed from local cache and one from server response. 
Why it fits: stable merge keeps timeline sorted with predictable complexity.\nfunction mergeSortedFeeds(a, b) { let i = 0; let j = 0; const out = []; while (i \u0026lt; a.length \u0026amp;\u0026amp; j \u0026lt; b.length) { if (a[i].ts \u0026lt;= b[j].ts) out.push(a[i++]); else out.push(b[j++]); } while (i \u0026lt; a.length) out.push(a[i++]); while (j \u0026lt; b.length) out.push(b[j++]); return out; } console.log(mergeSortedFeeds([{ ts: 1 }, { ts: 5 }], [{ ts: 2 }, { ts: 4 }])); R - Reflection (Complexity, Alternatives, Tradeoffs) Complexity Time: O(m+n) Space: O(1) extra (iterative pointer solution) Alternatives Method Time Extra Space Notes Flatten + sort + rebuild O((m+n)log(m+n)) O(m+n) easy but not in-place Recursive merge O(m+n) O(m+n) stack worst-case concise but stack risk Sentinel iterative merge O(m+n) O(1) most practical in interviews and systems code Common mistakes Forgetting to move tail after attachment Losing list remainder by not attaching final non-null list Mishandling null inputs Creating new nodes unnecessarily Why this is the best practical method It matches constraints and is robust:\nlinear no extra container simple invariant-based correctness FAQ and Notes Do we need to allocate new nodes?\nNo, splicing existing nodes is enough.\nIs recursion acceptable?\nFunctionally yes, but iterative is safer for long lists.\nWhat if equal values appear?\nUse \u0026lt;= for stable preference from list1.\nBest Practices Always use dummy + tail for list-construction tasks Keep pointer updates in one consistent order Test edge cases: empty lists, one empty, all equal values Reuse this merge as a helper for merge k lists and list sorting S - Summary Merge Two Sorted Lists is exactly merge-sort\u0026rsquo;s merge step on linked lists Sentinel node removes head special-case complexity Two-pointer iterative merge is O(m+n) time and O(1) extra space This template is foundational for many advanced linked-list problems Further Reading LeetCode 21. 
Merge Two Sorted Lists LeetCode 23. Merge k Sorted Lists LeetCode 148. Sort List LeetCode 206. Reverse Linked List Conclusion If you can write this merge in one pass without pointer mistakes, your linked-list fundamentals are in good shape. Practice this with merge k lists next to make the pattern production-ready.\nReferences https://leetcode.com/problems/merge-two-sorted-lists/ https://en.cppreference.com/w/cpp/container/forward_list https://docs.python.org/3/tutorial/datastructures.html https://doc.rust-lang.org/std/option/ Meta Info Reading time: 10-12 min Tags: Hot100, linked list, merge, two pointers SEO keywords: Merge Two Sorted Lists, LeetCode 21, sentinel node, O(m+n) Meta description: A practical ACERS guide to sentinel-node linked-list merge with multi-language runnable code. CTA Try implementing this in under 10 minutes in your primary language. Then extend it to merge k sorted lists to lock in the divide-and-conquer upgrade path.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def merge_two_lists(list1, list2): dummy = ListNode(0) tail = dummy p1, p2 = list1, list2 while p1 and p2: if p1.val \u0026lt;= p2.val: nxt = p1.next tail.next = p1 p1.next = None p1 = nxt else: nxt = p2.next tail.next = p2 p2.next = None p2 = nxt tail = tail.next tail.next = p1 if p1 else p2 return dummy.next #include \u0026lt;stddef.h\u0026gt; typedef struct ListNode { int val; struct ListNode* next; } ListNode; ListNode* mergeTwoLists(ListNode* list1, ListNode* list2) { ListNode dummy; dummy.next = NULL; ListNode* tail = \u0026amp;dummy; while (list1 \u0026amp;\u0026amp; list2) { if (list1-\u0026gt;val \u0026lt;= list2-\u0026gt;val) { ListNode* nxt = list1-\u0026gt;next; tail-\u0026gt;next = list1; list1-\u0026gt;next = NULL; list1 = nxt; } else { ListNode* nxt = list2-\u0026gt;next; tail-\u0026gt;next = list2; list2-\u0026gt;next = NULL; list2 = nxt; } tail = 
tail-\u0026gt;next; } tail-\u0026gt;next = list1 ? list1 : list2; return dummy.next; } struct ListNode { int val; ListNode* next; ListNode(int x = 0, ListNode* n = nullptr) : val(x), next(n) {} }; class Solution { public: ListNode* mergeTwoLists(ListNode* list1, ListNode* list2) { ListNode dummy; ListNode* tail = \u0026amp;dummy; while (list1 \u0026amp;\u0026amp; list2) { if (list1-\u0026gt;val \u0026lt;= list2-\u0026gt;val) { ListNode* nxt = list1-\u0026gt;next; tail-\u0026gt;next = list1; list1-\u0026gt;next = nullptr; list1 = nxt; } else { ListNode* nxt = list2-\u0026gt;next; tail-\u0026gt;next = list2; list2-\u0026gt;next = nullptr; list2 = nxt; } tail = tail-\u0026gt;next; } tail-\u0026gt;next = list1 ? list1 : list2; return dummy.next; } }; package main type ListNode struct { Val int Next *ListNode } func mergeTwoLists(list1 *ListNode, list2 *ListNode) *ListNode { dummy := \u0026amp;ListNode{} tail := dummy for list1 != nil \u0026amp;\u0026amp; list2 != nil { if list1.Val \u0026lt;= list2.Val { nxt := list1.Next tail.Next = list1 list1.Next = nil list1 = nxt } else { nxt := list2.Next tail.Next = list2 list2.Next = nil list2 = nxt } tail = tail.Next } if list1 != nil { tail.Next = list1 } else { tail.Next = list2 } return dummy.Next } #[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { #[inline] fn new(val: i32) -\u0026gt; Self { ListNode { val, next: None } } } pub fn merge_two_lists( mut list1: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, mut list2: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, ) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { let mut dummy = Box::new(ListNode::new(0)); let mut tail = \u0026amp;mut dummy; while list1.is_some() \u0026amp;\u0026amp; list2.is_some() { let take_left = list1.as_ref().unwrap().val \u0026lt;= list2.as_ref().unwrap().val; if take_left { let mut node = list1.take().unwrap(); 
list1 = node.next.take(); tail.next = Some(node); } else { let mut node = list2.take().unwrap(); list2 = node.next.take(); tail.next = Some(node); } tail = tail.next.as_mut().unwrap(); } tail.next = if list1.is_some() { list1 } else { list2 }; dummy.next } function ListNode(val, next = null) { this.val = val; this.next = next; } function mergeTwoLists(list1, list2) { const dummy = new ListNode(0); let tail = dummy; let p1 = list1; let p2 = list2; while (p1 \u0026amp;\u0026amp; p2) { if (p1.val \u0026lt;= p2.val) { const nxt = p1.next; tail.next = p1; p1.next = null; p1 = nxt; } else { const nxt = p2.next; tail.next = p2; p2.next = null; p2 = nxt; } tail = tail.next; } tail.next = p1 || p2; return dummy.next; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/21-merge-two-sorted-lists/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThis problem is the linked-list version of merge-sort\u0026rsquo;s merge step. 
Use a sentinel node plus two pointers to splice nodes in ascending order in O(m+n), without rebuilding the list.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003emerge\u003c/code\u003e, \u003ccode\u003etwo pointers\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Merge Two Sorted Lists, sentinel node, linked list merge, LeetCode 21, Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A complete ACERS guide for LeetCode 21 with derivation, correctness invariants, pitfalls, and runnable multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners preparing linked-list interview templates\u003c/li\u003e\n\u003cli\u003eDevelopers who often lose nodes while rewiring \u003ccode\u003enext\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003eEngineers who need stable O(1)-extra-space merge patterns\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThis is a small problem with large transfer value:\u003c/p\u003e","title":"Hot100: Merge Two Sorted Lists Sentinel Two-Pointer Merge ACERS Guide"},{"content":" Subtitle / Summary\nFirst Missing Positive is a classic in-place indexing problem. Place each valid value x into slot x-1, then scan for the first mismatch. 
This ACERS guide explains the derivation, invariant, pitfalls, and production-style transfer.\nReading time: 12-15 min Tags: Hot100, array, in-place hashing SEO keywords: First Missing Positive, in-place hashing, index mapping, O(n), Hot100, LeetCode 41 Meta description: O(n)/O(1) solution for First Missing Positive using in-place index placement, with complexity analysis, engineering scenarios, and runnable multi-language code. Target Readers Hot100 learners building stable array templates Intermediate developers who want to master in-place indexing techniques Engineers who need linear-time, constant-space array normalization Background / Motivation \u0026ldquo;Find the smallest missing positive\u0026rdquo; is fundamentally a placement problem.\nIf value x is present and 1 \u0026lt;= x \u0026lt;= n, then in an ideal arrangement it should sit at index x-1. Once this structure is built, the answer is just the first index where the rule breaks.\nThe challenge is constraint-driven:\nO(n) time O(1) extra space That forces us to avoid sorting (O(n log n)) and extra hash sets (O(n) space), and use in-place swapping instead.\nCore Concepts Concept Meaning Why it matters In-place hashing Use array indices as hash buckets keeps extra space O(1) Index placement value x belongs to index x-1 transforms search into scan Swap-to-place repeatedly swap until stable enables O(n) normalization A - Algorithm (Problem and Algorithm) Problem Restatement Given an unsorted integer array nums, return the smallest missing positive integer. You must design an algorithm with O(n) time and O(1) extra space.\nInput / Output Name Type Description nums int[] unsorted integer array return int smallest missing positive integer Example 1 input: nums = [1, 2, 0] output: 3 Example 2 input: nums = [3, 4, -1, 1] output: 2 High-Level Procedure For each index i, keep swapping until nums[i] is either invalid or already in correct slot. After placement, scan from left to right. 
First index i where nums[i] != i + 1 gives answer i + 1. If all match, answer is n + 1. Thought Process: From Sorting/Hashing to In-Place Placement Naive option 1: brute-force candidate test Try 1, 2, 3, ... and scan the array each time. This is O(n^2), not acceptable for large inputs.\nNaive option 2: sort then scan Sorting makes order visible, but time becomes O(n log n), violating O(n) requirement.\nNaive option 3: hash set Hash set gives O(n) time, but costs O(n) extra memory, violating O(1) space.\nKey observation Only values in range [1, n] can affect the answer within [1, n+1]. So we can treat the array itself as buckets:\nvalue x should be placed at index x - 1 Once placement converges, the first broken bucket reveals the answer.\nC - Concepts (Core Ideas) Method Category In-place hashing (index-as-hash) Array normalization via swapping Linear validation scan Core Invariant After placement phase:\nif positive integer k exists in array and 1 \u0026lt;= k \u0026lt;= n, then ideally nums[k-1] == k first index i with nums[i] != i+1 means value i+1 is missing Why repeated swaps are still linear Each successful swap places at least one value into its final position. A value can be moved only a limited number of times before becoming stable. 
Total swaps across whole array remain O(n).\nPractical Guide / Steps Let n = len(nums), set pointer i = 0 While i \u0026lt; n: v = nums[i] if 1 \u0026lt;= v \u0026lt;= n and nums[v-1] != v, swap nums[i] with nums[v-1] else i += 1 Run a second scan: first i where nums[i] != i+1, return i+1 If no mismatch, return n+1 Runnable Example (Python) from typing import List def first_missing_positive(nums: List[int]) -\u0026gt; int: n = len(nums) i = 0 while i \u0026lt; n: v = nums[i] if 1 \u0026lt;= v \u0026lt;= n and nums[v - 1] != v: nums[i], nums[v - 1] = nums[v - 1], nums[i] else: i += 1 for i, v in enumerate(nums): if v != i + 1: return i + 1 return n + 1 if __name__ == \u0026#34;__main__\u0026#34;: print(first_missing_positive([1, 2, 0])) print(first_missing_positive([3, 4, -1, 1])) Run:\npython3 first_missing_positive.py Explanation / Why This Works This algorithm converts a search problem into a placement problem.\nValues outside [1, n] are irrelevant for first missing positive in [1, n+1] Valid value x is forced toward slot x-1 Placement phase builds a partially sorted-by-meaning layout Scan phase finds the first missing positive in one pass So we satisfy both constraints simultaneously: O(n) time and O(1) extra memory.\nE - Engineering (Real-world Scenarios) Scenario 1: next available compact ID in analytics batch (Python) Background: find smallest missing positive ID in a daily import batch.\nWhy it fits: no extra structures needed for large in-memory batches.\ndef next_missing_id(ids): n = len(ids) i = 0 while i \u0026lt; n: v = ids[i] if 1 \u0026lt;= v \u0026lt;= n and ids[v - 1] != v: ids[i], ids[v - 1] = ids[v - 1], ids[i] else: i += 1 for idx, val in enumerate(ids): if val != idx + 1: return idx + 1 return n + 1 print(next_missing_id([2, 1, 4, 6, 3])) Scenario 2: shard-index gap detection in backend config (Go) Background: detect smallest missing shard number in an unsorted config list.\nWhy it fits: linear-time check during startup validation.\npackage main 
import \u0026#34;fmt\u0026#34; func firstMissingPositive(nums []int) int { n := len(nums) i := 0 for i \u0026lt; n { v := nums[i] if v \u0026gt;= 1 \u0026amp;\u0026amp; v \u0026lt;= n \u0026amp;\u0026amp; nums[v-1] != v { nums[i], nums[v-1] = nums[v-1], nums[i] } else { i++ } } for i, v := range nums { if v != i+1 { return i + 1 } } return n + 1 } func main() { fmt.Println(firstMissingPositive([]int{3, 4, -1, 1})) } Scenario 3: front-end task sequence gap finder (JavaScript) Background: allocate the smallest missing positive sequence number on client side.\nWhy it fits: no dependency on backend call or extra cache.\nfunction firstMissingPositive(nums) { const n = nums.length; let i = 0; while (i \u0026lt; n) { const v = nums[i]; if (v \u0026gt;= 1 \u0026amp;\u0026amp; v \u0026lt;= n \u0026amp;\u0026amp; nums[v - 1] !== v) { const tmp = nums[v - 1]; nums[v - 1] = nums[i]; nums[i] = tmp; } else { i += 1; } } for (let j = 0; j \u0026lt; n; j += 1) { if (nums[j] !== j + 1) return j + 1; } return n + 1; } console.log(firstMissingPositive([1, 2, 0])); R - Reflection Complexity Time: O(n) Extra space: O(1) Alternative Comparison Method Time Space Issue brute-force candidate checks O(n^2) O(1) too slow sorting + scan O(n log n) O(1) / O(log n) violates linear-time target hash set O(n) O(n) violates constant-space target in-place index placement O(n) O(1) best fit for constraints Common Mistakes Forgetting duplicate guard nums[v-1] != v, causing endless swaps Advancing i after every swap (should re-check new value at same index) Trying to place values outside [1, n] Returning wrong fallback (should be n + 1 when all slots match) Why this method is practically optimal It converts strict constraints into structure:\nno extra memory allocations linear scan + bounded swapping deterministic and template-friendly for array normalization tasks FAQ and Notes Why ignore non-positive values and values \u0026gt; n?\nThey cannot be the first missing positive in [1, n+1].\nWhy can 
answer be n+1?\nIf all of 1..n exist, the smallest missing positive is the next one, n + 1.\nWill duplicates break correctness?\nNo, as long as the duplicate guard runs before each swap.\nCan this be done without swaps?\nThere are marking-based variants, but swap placement is the most direct O(1)-space template.\nBest Practices Memorize the placement condition as one line: 1 \u0026lt;= v \u0026lt;= n \u0026amp;\u0026amp; nums[v-1] != v Keep two phases explicit: placement then scan Add tests for edge cases: all negatives already continuous [1..n] duplicates empty array Prefer stable variable names (v, n, i) for readability in pointer-style loops S - Summary First Missing Positive is an index-placement problem under strict constraints In-place hashing maps value x to slot x-1 Placement + validation scan gives O(n)/O(1) The duplicate guard and loop control are the two bug hotspots This pattern is highly reusable for array normalization tasks Recommended Follow-up LeetCode 448 — Find All Numbers Disappeared in an Array LeetCode 442 — Find All Duplicates in an Array LeetCode 287 — Find the Duplicate Number Cyclic sort / index placement pattern notes Conclusion Once you internalize \u0026ldquo;place each value in its index slot, then scan for the first mismatch\u0026rdquo;, LeetCode 41 becomes a reusable engineering pattern rather than an interview trick.\nReferences https://leetcode.com/problems/first-missing-positive/ https://en.cppreference.com/w/cpp/algorithm/swap https://docs.python.org/3/library/stdtypes.html#list https://go.dev/doc/effective_go Meta Info Reading time: 12-15 min Tags: Hot100, array, in-place hashing SEO keywords: First Missing Positive, in-place hashing, index mapping, LeetCode 41 Meta description: O(n)/O(1) first missing positive with in-place index placement and linear validation. 
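The FAQ above mentions marking-based variants as an alternative to swapping. One common sign-marking sketch is shown below; the function name is illustrative, and this is a companion to the swap template rather than a replacement for it:

```python
from typing import List

def first_missing_positive_marking(nums: List[int]) -> int:
    """Sign-marking variant: O(n) time, O(1) extra space, no swaps."""
    n = len(nums)
    # Phase 1: values outside [1, n] cannot be the answer; neutralize them.
    for i in range(n):
        if nums[i] <= 0 or nums[i] > n:
            nums[i] = n + 1
    # Phase 2: mark "value v was seen" by negating the slot at index v - 1.
    for i in range(n):
        v = abs(nums[i])
        if v <= n and nums[v - 1] > 0:
            nums[v - 1] = -nums[v - 1]
    # Phase 3: the first unmarked (still positive) slot reveals the answer.
    for i in range(n):
        if nums[i] > 0:
            return i + 1
    return n + 1

print(first_missing_positive_marking([3, 4, -1, 1]))  # 2
```

The tradeoff: sign marking avoids the swap loop but mutates values instead of positions, so it assumes the input may be overwritten, just like the swap template.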
Call To Action (CTA) Do this drill sequence now:\nRe-implement 41 from memory with duplicate guard Adapt same idea to 448 (missing numbers) Compare swap-based placement and sign-marking variants Multi-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import List def first_missing_positive(nums: List[int]) -\u0026gt; int: n = len(nums) i = 0 while i \u0026lt; n: v = nums[i] if 1 \u0026lt;= v \u0026lt;= n and nums[v - 1] != v: nums[i], nums[v - 1] = nums[v - 1], nums[i] else: i += 1 for i, v in enumerate(nums): if v != i + 1: return i + 1 return n + 1 int firstMissingPositive(int* nums, int numsSize) { int i = 0; while (i \u0026lt; numsSize) { int v = nums[i]; if (v \u0026gt;= 1 \u0026amp;\u0026amp; v \u0026lt;= numsSize \u0026amp;\u0026amp; nums[v - 1] != v) { int tmp = nums[v - 1]; nums[v - 1] = nums[i]; nums[i] = tmp; } else { i++; } } for (i = 0; i \u0026lt; numsSize; ++i) { if (nums[i] != i + 1) return i + 1; } return numsSize + 1; } class Solution { public: int firstMissingPositive(vector\u0026lt;int\u0026gt;\u0026amp; nums) { int n = (int)nums.size(); int i = 0; while (i \u0026lt; n) { int v = nums[i]; if (v \u0026gt;= 1 \u0026amp;\u0026amp; v \u0026lt;= n \u0026amp;\u0026amp; nums[v - 1] != v) { swap(nums[i], nums[v - 1]); } else { i++; } } for (int i = 0; i \u0026lt; n; ++i) { if (nums[i] != i + 1) return i + 1; } return n + 1; } }; func firstMissingPositive(nums []int) int { n := len(nums) i := 0 for i \u0026lt; n { v := nums[i] if v \u0026gt;= 1 \u0026amp;\u0026amp; v \u0026lt;= n \u0026amp;\u0026amp; nums[v-1] != v { nums[i], nums[v-1] = nums[v-1], nums[i] } else { i++ } } for i, v := range nums { if v != i+1 { return i + 1 } } return n + 1 } pub fn first_missing_positive(nums: \u0026amp;mut Vec\u0026lt;i32\u0026gt;) -\u0026gt; i32 { let n = nums.len(); let mut i = 0usize; while i \u0026lt; n { let v = nums[i]; if v \u0026gt;= 1 \u0026amp;\u0026amp; (v as usize) \u0026lt;= n \u0026amp;\u0026amp; nums[(v - 1) as usize] != v { 
nums.swap(i, (v - 1) as usize); } else { i += 1; } } for (i, v) in nums.iter().enumerate() { if *v != (i as i32) + 1 { return (i as i32) + 1; } } (n as i32) + 1 } function firstMissingPositive(nums) { const n = nums.length; let i = 0; while (i \u0026lt; n) { const v = nums[i]; if (v \u0026gt;= 1 \u0026amp;\u0026amp; v \u0026lt;= n \u0026amp;\u0026amp; nums[v - 1] !== v) { [nums[i], nums[v - 1]] = [nums[v - 1], nums[i]]; } else { i += 1; } } for (let i = 0; i \u0026lt; n; i += 1) { if (nums[i] !== i + 1) return i + 1; } return n + 1; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/41-first-missing-positive/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nFirst Missing Positive is a classic in-place indexing problem. Place each valid value \u003ccode\u003ex\u003c/code\u003e into slot \u003ccode\u003ex-1\u003c/code\u003e, then scan for the first mismatch. This ACERS guide explains the derivation, invariant, pitfalls, and production-style transfer.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003earray\u003c/code\u003e, \u003ccode\u003ein-place hashing\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: First Missing Positive, in-place hashing, index mapping, O(n), Hot100, LeetCode 41\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: O(n)/O(1) solution for First Missing Positive using in-place index placement, with complexity analysis, engineering scenarios, and runnable multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners building stable array 
templates\u003c/li\u003e\n\u003cli\u003eIntermediate developers who want to master in-place indexing techniques\u003c/li\u003e\n\u003cli\u003eEngineers who need linear-time, constant-space array normalization\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003e\u0026ldquo;Find the smallest missing positive\u0026rdquo; is fundamentally a \u003cstrong\u003eplacement\u003c/strong\u003e problem.\u003c/p\u003e","title":"Hot100: First Missing Positive In-Place Index Placement ACERS Guide"},{"content":" Subtitle / Summary\nLeetCode 25 is the upgrade path from 206 (full reversal) and 92 (interval reversal): split by groups, reverse inside each full group, reconnect safely, and keep the last incomplete group unchanged.\nReading time: 14-18 min Tags: Hot100, linked list, group reversal, dummy node SEO keywords: Reverse Nodes in k-Group, group reversal, LeetCode 25, Hot100 Meta description: In-place k-group linked-list reversal with dummy-node anchoring and safe reconnection, including pitfalls, complexity, and runnable multi-language implementations. 
Target Readers Hot100 learners who already finished 206/92 and want the next linked-list jump Developers who often fail at boundary handling and reconnection in pointer-heavy tasks Engineers building reusable templates for chunk-based list transformation Background / Motivation In real systems, chain structures are often processed in batches:\nreplay compensation tasks per fixed batch size local reorder of pipeline nodes by chunk in-place structure rewrite without reallocating all nodes These tasks require:\ntransformation inside each batch stable order between batches clear rule for incomplete tail batches (keep unchanged) LeetCode 25 models exactly this requirement.\nCore Concepts Dummy node: removes head-special branches groupPrev: predecessor of the current group kth probe: checks whether a full group of size k exists groupNext: first node after the current group In-place group reversal: reverse only [groupStart, kth] A - Algorithm (Problem and Algorithm) Problem Restatement Given the head of a linked list and an integer k, reverse nodes in groups of size k and return the modified head.\nIf the number of remaining nodes is less than k, keep them in original order.\nYou must modify pointers, not just node values.\nInput / Output Name Type Description head ListNode head of singly linked list k int group size (k \u0026gt;= 1) return ListNode new head after group-wise reversal Example 1 input: head = 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5, k = 2 output: 2 -\u0026gt; 1 -\u0026gt; 4 -\u0026gt; 3 -\u0026gt; 5 Example 2 input: head = 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5, k = 3 output: 3 -\u0026gt; 2 -\u0026gt; 1 -\u0026gt; 4 -\u0026gt; 5 Thought Process: From Naive to Optimal Naive approach: array conversion and rebuild Convert linked list to array Reverse every full k-sized segment Rebuild list from array Problems:\nO(n) extra memory violates pointer-modification intent in many interview/engineering contexts Key observation The process 
can be decomposed into a repeating loop:\nverify current group has k nodes reverse this group in-place reconnect and move to next group This is \u0026ldquo;interval reversal\u0026rdquo; repeated under group control.\nMethod selection Use:\ndummy + groupPrev to anchor global structure kth scanning to validate group completeness in-place pointer reversal per full group Result:\nO(n) time O(1) extra space C - Concepts (Core Ideas) Method Category In-place linked-list rewiring Chunk/batch processing Boundary locating with predecessor + kth probe Loop Invariant At the start of each iteration:\ngroupPrev.next points to the first node of the next unprocessed group all nodes before groupPrev are already in final form After one successful group operation:\ncurrent group is reversed and reconnected correctly groupPrev moves to the new tail of this reversed group (old group head) This guarantees stable forward progress without breaking previous groups.\nStructure Sketch (k=3) dummy -\u0026gt; a -\u0026gt; b -\u0026gt; c -\u0026gt; d -\u0026gt; e -\u0026gt; f -\u0026gt; g ^ ^ groupStart kth reverse [a,b,c] =\u0026gt; dummy -\u0026gt; c -\u0026gt; b -\u0026gt; a -\u0026gt; d -\u0026gt; e -\u0026gt; f -\u0026gt; g ^ new groupPrev Practical Guide / Steps Initialize dummy.next = head, set groupPrev = dummy Scan k steps from groupPrev to find kth if missing, stop (tail group is incomplete) Save groupNext = kth.next Reverse [groupPrev.next, kth] with prev = groupNext Reconnect: groupPrev.next -\u0026gt; new group head old group head becomes new group tail Move groupPrev to new group tail and continue Runnable Example (Python) from typing import Optional class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def reverse_k_group(head: Optional[ListNode], k: int) -\u0026gt; Optional[ListNode]: if not head or k \u0026lt;= 1: return head dummy = ListNode(0, head) group_prev = dummy while True: kth = group_prev for _ in range(k): kth = kth.next if not kth: 
return dummy.next group_next = kth.next prev = group_next cur = group_prev.next while cur != group_next: nxt = cur.next cur.next = prev prev = cur cur = nxt new_group_tail = group_prev.next group_prev.next = prev group_prev = new_group_tail def build(nums): dummy = ListNode() tail = dummy for x in nums: tail.next = ListNode(x) tail = tail.next return dummy.next def to_list(head): ans = [] while head: ans.append(head.val) head = head.next return ans if __name__ == \u0026#34;__main__\u0026#34;: h = build([1, 2, 3, 4, 5]) print(to_list(reverse_k_group(h, 2))) # [2, 1, 4, 3, 5] Explanation (Why This Works) groupPrev acts as a stable anchor before the current group.\nInside each group, set prev = groupNext before reversal, so the reversed tail automatically reconnects to the next segment.\nThis removes extra branch logic and keeps each group operation structurally identical.\nE - Engineering (Real-world Scenarios) Scenario 1: Batch compensation-chain replay (Go) Background: reverse order inside each fixed-size task batch.\nWhy it fits: in-place, bounded memory, clear tail behavior.\npackage main type Node struct { Val int Next *Node } func reverseKGroup(head *Node, k int) *Node { if head == nil || k \u0026lt;= 1 { return head } dummy := \u0026amp;Node{Next: head} groupPrev := dummy for { kth := groupPrev for i := 0; i \u0026lt; k; i++ { kth = kth.Next if kth == nil { return dummy.Next } } groupNext := kth.Next prev, cur := groupNext, groupPrev.Next for cur != groupNext { nxt := cur.Next cur.Next = prev prev = cur cur = nxt } tail := groupPrev.Next groupPrev.Next = prev groupPrev = tail } } Scenario 2: Segment rollback in event chains (Python) Background: replay/rollback events by fixed group windows.\nWhy it fits: exact group control and deterministic pointer behavior.\n# Reuse reverse_k_group(head, k) from above. 
Scenario 3: Workflow editor chunk reversal (JavaScript) Background: UI supports \u0026ldquo;reverse every k selected nodes\u0026rdquo;.\nWhy it fits: efficient in-memory operation without rebuilding the whole chain.\nfunction reverseKGroup(head, k) { if (!head || k \u0026lt;= 1) return head; const dummy = { val: 0, next: head }; let groupPrev = dummy; while (true) { let kth = groupPrev; for (let i = 0; i \u0026lt; k; i += 1) { kth = kth.next; if (!kth) return dummy.next; } const groupNext = kth.next; let prev = groupNext; let cur = groupPrev.next; while (cur !== groupNext) { const nxt = cur.next; cur.next = prev; prev = cur; cur = nxt; } const newGroupTail = groupPrev.next; groupPrev.next = prev; groupPrev = newGroupTail; } } R - Reflection Complexity Analysis Time: O(n)\nEach node is visited and rewired a constant number of times. Space: O(1)\nOnly constant pointer variables are used. Alternatives and Tradeoffs Method Time Space Notes array conversion O(n) O(n) simpler, but not in-place recursive group reversal O(n) O(n/k)~O(n) stack elegant but stack-risky iterative in-place group reversal O(n) O(1) production-friendly default Common Mistakes not checking whether kth exists before reversing forgetting to move groupPrev after each group (can cause infinite loops) reversing incomplete tail group (violates problem requirement) swapping values instead of rewiring nodes Why this approach is optimal in practice linear time constant memory explicit and testable boundary control It is the most stable template for interview and production-style pointer work.\nFAQ and Notes What if k = 1?\nReturn original list.\nWhat if list length is not divisible by k?\nKeep the final incomplete group unchanged.\nCan recursion solve this problem?\nYes, but iterative form is safer for stack depth and usually easier to debug.\nHow is this related to 92?\nLeetCode 25 is repeated interval reversal (92) under fixed-size grouping.\nBest Practices Memorize the pattern: dummy -\u0026gt; find kth 
-\u0026gt; reverse group -\u0026gt; reconnect -\u0026gt; move groupPrev Log groupPrev, kth, groupNext during debugging Validate with edge sets: k=1, k=2, k=len, k\u0026gt;len Review together with 206 and 92 as a linked-list reversal trilogy S - Summary LeetCode 25 is group-driven interval reversal Dummy node unifies head-boundary handling kth probing is the correctness gate before each reversal In-place rewiring achieves O(n)/O(1) This template generalizes to many advanced list reordering tasks Recommended Follow-up LeetCode 206 — Reverse Linked List LeetCode 92 — Reverse Linked List II LeetCode 24 — Swap Nodes in Pairs LeetCode 143 — Reorder List Conclusion Once \u0026ldquo;k-group scan + local reversal + safe reconnection\u0026rdquo; becomes your default pattern, LeetCode 25 becomes predictable pointer engineering instead of fragile pointer gymnastics.\nReferences https://leetcode.com/problems/reverse-nodes-in-k-group/ https://en.cppreference.com/w/cpp/container/forward_list https://doc.rust-lang.org/book/ch15-01-box.html https://go.dev/doc/effective_go Meta Info Reading time: 14-18 min Tags: Hot100, linked list, group reversal, dummy node SEO keywords: Reverse Nodes in k-Group, LeetCode 25, Hot100 Meta description: O(n)/O(1) in-place k-group linked-list reversal with robust boundary handling. 
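The FAQ notes that recursion can also solve k-group reversal. A minimal recursive sketch follows for contrast with the iterative template; helper names (`build`, `to_list`) are illustrative, and stack depth is the reason the article prefers the iterative form:

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def reverse_k_group_rec(head, k):
    # Recursive variant: O(n) time, but O(n/k) call-stack depth.
    if head is None or k <= 1:
        return head
    kth = head
    for _ in range(k):              # probe: does a full group of k exist?
        if kth is None:
            return head             # incomplete tail group stays unchanged
        kth = kth.next
    prev, cur = None, head
    for _ in range(k):              # reverse exactly k nodes in-place
        nxt = cur.next
        cur.next = prev
        prev = cur
        cur = nxt
    head.next = reverse_k_group_rec(cur, k)  # old group head is new tail
    return prev                     # old k-th node is the new group head

def build(nums):
    dummy = tail = ListNode()
    for x in nums:
        tail.next = ListNode(x)
        tail = tail.next
    return dummy.next

def to_list(head):
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out

print(to_list(reverse_k_group_rec(build([1, 2, 3, 4, 5]), 2)))  # [2, 1, 4, 3, 5]
```

Note how the probe-then-reverse-then-reconnect structure is identical to the iterative version; recursion only replaces the explicit groupPrev bookkeeping with the call stack.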
Call To Action (CTA) Do this practice loop now:\nRe-implement 25 from memory without looking Adapt same structure to 24 (pair swap) Compare with 92 to internalize single-interval vs repeated-group control Multi-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import Optional class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def reverse_k_group(head: Optional[ListNode], k: int) -\u0026gt; Optional[ListNode]: if not head or k \u0026lt;= 1: return head dummy = ListNode(0, head) group_prev = dummy while True: kth = group_prev for _ in range(k): kth = kth.next if not kth: return dummy.next group_next = kth.next prev = group_next cur = group_prev.next while cur != group_next: nxt = cur.next cur.next = prev prev = cur cur = nxt new_group_tail = group_prev.next group_prev.next = prev group_prev = new_group_tail #include \u0026lt;stdlib.h\u0026gt; typedef struct ListNode { int val; struct ListNode *next; } ListNode; ListNode* reverseKGroup(ListNode* head, int k) { if (!head || k \u0026lt;= 1) return head; ListNode dummy; dummy.val = 0; dummy.next = head; ListNode* groupPrev = \u0026amp;dummy; while (1) { ListNode* kth = groupPrev; for (int i = 0; i \u0026lt; k; ++i) { kth = kth-\u0026gt;next; if (!kth) return dummy.next; } ListNode* groupNext = kth-\u0026gt;next; ListNode* prev = groupNext; ListNode* cur = groupPrev-\u0026gt;next; while (cur != groupNext) { ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; } ListNode* newGroupTail = groupPrev-\u0026gt;next; groupPrev-\u0026gt;next = prev; groupPrev = newGroupTail; } } struct ListNode { int val; ListNode* next; ListNode(int x) : val(x), next(nullptr) {} }; class Solution { public: ListNode* reverseKGroup(ListNode* head, int k) { if (!head || k \u0026lt;= 1) return head; ListNode dummy(0); dummy.next = head; ListNode* groupPrev = \u0026amp;dummy; while (true) { ListNode* kth = groupPrev; for (int i = 0; i \u0026lt; k; ++i) { kth = 
kth-\u0026gt;next; if (!kth) return dummy.next; } ListNode* groupNext = kth-\u0026gt;next; ListNode* prev = groupNext; ListNode* cur = groupPrev-\u0026gt;next; while (cur != groupNext) { ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; } ListNode* newGroupTail = groupPrev-\u0026gt;next; groupPrev-\u0026gt;next = prev; groupPrev = newGroupTail; } } }; package main type ListNode struct { Val int Next *ListNode } func reverseKGroup(head *ListNode, k int) *ListNode { if head == nil || k \u0026lt;= 1 { return head } dummy := \u0026amp;ListNode{Next: head} groupPrev := dummy for { kth := groupPrev for i := 0; i \u0026lt; k; i++ { kth = kth.Next if kth == nil { return dummy.Next } } groupNext := kth.Next prev, cur := groupNext, groupPrev.Next for cur != groupNext { nxt := cur.Next cur.Next = prev prev = cur cur = nxt } newGroupTail := groupPrev.Next groupPrev.Next = prev groupPrev = newGroupTail } } #[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { #[inline] fn new(val: i32) -\u0026gt; Self { ListNode { next: None, val } } } pub fn reverse_k_group(head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, k: i32) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { let k = k as usize; if k \u0026lt;= 1 { return head; } let mut dummy = Box::new(ListNode { val: 0, next: head }); let mut group_prev: \u0026amp;mut Box\u0026lt;ListNode\u0026gt; = \u0026amp;mut dummy; loop { let mut check = group_prev.next.as_ref(); for _ in 0..k { match check { Some(node) =\u0026gt; check = node.next.as_ref(), None =\u0026gt; return dummy.next, } } let mut cur = group_prev.next.take(); let mut rev: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; = None; for _ in 0..k { let mut node = cur.unwrap(); cur = node.next.take(); node.next = rev; rev = Some(node); } group_prev.next = rev; for _ in 0..k { group_prev = 
group_prev.next.as_mut().unwrap(); } group_prev.next = cur; } } function ListNode(val, next = null) { this.val = val; this.next = next; } function reverseKGroup(head, k) { if (!head || k \u0026lt;= 1) return head; const dummy = new ListNode(0, head); let groupPrev = dummy; while (true) { let kth = groupPrev; for (let i = 0; i \u0026lt; k; i += 1) { kth = kth.next; if (!kth) return dummy.next; } const groupNext = kth.next; let prev = groupNext; let cur = groupPrev.next; while (cur !== groupNext) { const nxt = cur.next; cur.next = prev; prev = cur; cur = nxt; } const newGroupTail = groupPrev.next; groupPrev.next = prev; groupPrev = newGroupTail; } } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/25-reverse-nodes-in-k-group/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nLeetCode 25 is the upgrade path from 206 (full reversal) and 92 (interval reversal): split by groups, reverse inside each full group, reconnect safely, and keep the last incomplete group unchanged.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 14-18 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003egroup reversal\u003c/code\u003e, \u003ccode\u003edummy node\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Reverse Nodes in k-Group, group reversal, LeetCode 25, Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: In-place k-group linked-list reversal with dummy-node anchoring and safe reconnection, including pitfalls, complexity, and runnable multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget 
Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who already finished 206/92 and want the next linked-list jump\u003c/li\u003e\n\u003cli\u003eDevelopers who often fail at boundary handling and reconnection in pointer-heavy tasks\u003c/li\u003e\n\u003cli\u003eEngineers building reusable templates for chunk-based list transformation\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eIn real systems, chain structures are often processed in batches:\u003c/p\u003e","title":"Hot100: Reverse Nodes in k-Group Group-Wise In-Place ACERS Guide"},{"content":" Subtitle / Summary\nReorder List is a classic pointer choreography problem: find middle, reverse second half, then merge alternately. This guide derives the in-place O(n)/O(1) method from naive ideas and turns it into a reusable Hot100 template.\nReading time: 12-15 min Tags: Hot100, linked list, in-place SEO keywords: Reorder List, split reverse merge, LeetCode 143, O(1) space Meta description: A full ACERS explanation of Reorder List with correctness intuition, boundary handling, engineering mapping, and runnable code in Python/C/C++/Go/Rust/JS. Target Readers Hot100 learners who want a stable linked-list rewire template Developers who can reverse lists but still fail on alternating merge details Engineers preparing for interviews where O(1) extra space is required Background / Motivation At first glance, this looks like simple reordering. In reality, it tests whether you can safely perform three dependent pointer operations in one workflow:\nSplit one list into two valid lists Reverse one half in-place Alternate-merge without cycles or node loss Most bugs come from boundary handling and pointer update order, not from \u0026ldquo;algorithmic complexity\u0026rdquo; itself.\nCore Concepts Target order: L0 -\u0026gt; Ln -\u0026gt; L1 -\u0026gt; Ln-1 -\u0026gt; L2 -\u0026gt; ... 
In-place constraint: do not allocate a new list Three-phase pipeline: middle split (slow/fast) reverse second half alternating merge Critical safety rule: cut first half tail (slow.next = null) before merge A - Algorithm (Problem and Algorithm) Problem Restatement Given the head of a singly linked list head, reorder it to:\nL0 -\u0026gt; Ln -\u0026gt; L1 -\u0026gt; Ln-1 -\u0026gt; L2 -\u0026gt; Ln-2 -\u0026gt; ...\nConstraints:\nNode values must remain unchanged You can only rewire next pointers Input / Output Name Type Description head ListNode Head of the list return void Reorder in-place (head remains first node) Example 1 input: 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 output: 1 -\u0026gt; 4 -\u0026gt; 2 -\u0026gt; 3 Example 2 input: 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5 output: 1 -\u0026gt; 5 -\u0026gt; 2 -\u0026gt; 4 -\u0026gt; 3 Thought Process: From Naive to In-Place Optimal Naive idea 1: copy to array, rebuild by two pointers Put all nodes into an array Use i from left and j from right Reconnect in target order This is straightforward, but costs O(n) extra memory.\nNaive idea 2: repeatedly find tail and splice Keep taking tail and inserting after current head-side node This becomes O(n^2), too slow for larger inputs.\nKey observation The target sequence always alternates:\nfirst half in natural order second half in reverse order So the right decomposition is:\nsplit at middle reverse right half once weave two lists alternately This gives O(n) time and O(1) extra space.\nC - Concepts (Core Ideas) Method category Linked-list pointer manipulation Two-pointer middle search In-place reversal + zipper merge Phase invariants Split phase: after cut, left and right are independent lists Reverse phase: prev is always head of reversed prefix Merge phase: each iteration appends exactly one node from right into left chain Why the order is correct Left half: L0, L1, L2, ... Reversed right half: Ln, Ln-1, ... 
Alternating merge produces exactly: L0, Ln, L1, Ln-1, ... No node is duplicated because each node moves from one source list once.\nPractice Guide / Steps Handle trivial lists (0/1/2 nodes): already ordered Use slow/fast to find middle Let second = slow.next, then slow.next = null Reverse second Merge first and second alternately Runnable Python example (reorder_list.py):\nclass ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def reorder_list(head): if head is None or head.next is None: return # 1) find middle slow, fast = head, head while fast.next and fast.next.next: slow = slow.next fast = fast.next.next # 2) split and reverse second half second = slow.next slow.next = None prev = None cur = second while cur: nxt = cur.next cur.next = prev prev = cur cur = nxt second = prev # 3) merge two lists alternately first = head while second: n1 = first.next n2 = second.next first.next = second second.next = n1 first = n1 if n1 else second second = n2 def from_list(arr): dummy = ListNode() tail = dummy for x in arr: tail.next = ListNode(x) tail = tail.next return dummy.next def to_list(head): ans = [] while head: ans.append(head.val) head = head.next return ans if __name__ == \u0026#34;__main__\u0026#34;: h = from_list([1, 2, 3, 4, 5]) reorder_list(h) print(to_list(h)) # [1, 5, 2, 4, 3] Explanation / Why This Works The whole method depends on one strict ordering:\nfind split point cut list reverse right half weave If you skip step 2 (cut), merge often creates cycles because old links remain active.\nIf merge updates pointers in wrong order, nodes are lost. Always save both n1 and n2 before rewiring.\nE - Engineering (Real-world Scenarios) Scenario 1: Feed interleaving by freshness and baseline priority (Python) Background: a timeline combines old baseline-ranked items and newest items. 
Why it fits: alternating merge after reversing one side approximates \u0026ldquo;front-back interleaving\u0026rdquo; cheaply.\ndef interleave_ids(ids): left = ids[: (len(ids) + 1) // 2] right = list(reversed(ids[(len(ids) + 1) // 2 :])) out = [] i = j = 0 while i \u0026lt; len(left) or j \u0026lt; len(right): if i \u0026lt; len(left): out.append(left[i]); i += 1 if j \u0026lt; len(right): out.append(right[j]); j += 1 return out print(interleave_ids([1, 2, 3, 4, 5])) # [1, 5, 2, 4, 3] Scenario 2: Linked-list task queue reshaping in backend service (Go) Background: a queue represented as linked nodes needs deterministic reshaping without allocating new nodes. Why it fits: split-reverse-merge is O(1) extra memory and predictable.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { Val int Next *Node } func build(a []int) *Node { dummy := \u0026amp;Node{} p := dummy for _, x := range a { p.Next = \u0026amp;Node{Val: x} p = p.Next } return dummy.Next } func toSlice(h *Node) []int { ans := []int{} for h != nil { ans = append(ans, h.Val) h = h.Next } return ans } func main() { head := build([]int{1, 2, 3, 4}) // production code would call reorderList(head) fmt.Println(toSlice(head)) } Scenario 3: Frontend card stream alternating recent and legacy blocks (JavaScript) Background: a client-side list needs quick visual alternation for A/B display experiments. 
Why it fits: same structural pattern can be reused on arrays.\nfunction reorderArray(arr) { const left = arr.slice(0, Math.ceil(arr.length / 2)); const right = arr.slice(Math.ceil(arr.length / 2)).reverse(); const out = []; let i = 0; let j = 0; while (i \u0026lt; left.length || j \u0026lt; right.length) { if (i \u0026lt; left.length) out.push(left[i++]); if (j \u0026lt; right.length) out.push(right[j++]); } return out; } console.log(reorderArray([1, 2, 3, 4, 5])); // [1, 5, 2, 4, 3] R - Reflection (Complexity, Alternatives, Pitfalls) Complexity Time: O(n) Extra space: O(1) Alternatives and tradeoffs Method Time Extra Space Notes Array rebuild O(n) O(n) Easy but violates strict in-place goal Repeated tail extraction O(n^2) O(1) Too slow Split + reverse + merge O(n) O(1) Best practical template Common mistakes Forgetting to cut at middle (slow.next = null) causing cycles Wrong middle condition, especially for even length Merge order bug: rewiring before saving next pointers Not handling short lists (null or single node) Why this is the optimal practical method It matches constraints exactly:\nin-place linear time no extra container And each sub-step is reusable across multiple linked-list problems.\nFAQ and Notes Why is this not a palindrome-like compare problem?\nBecause we must physically reorder links, not just compare values.\nCan recursion solve it cleanly?\nYes in theory, but stack usage becomes O(n) and implementation is trickier.\nDo node values matter?\nNo. 
This is pointer topology, not value sorting.\nBest Practices Treat split/reverse/merge as three isolated templates, then compose Write and reuse helper functions in production code Validate with odd/even lengths and minimal lists Use pointer diagrams when debugging merge order S - Summary Reorder List is solved by split -\u0026gt; reverse -\u0026gt; alternating merge The key optimization is recognizing \u0026ldquo;second half reversed\u0026rdquo; as required shape Correctness depends on strict pointer update order and middle cut This template is foundational for many advanced linked-list problems Further Reading LeetCode 143. Reorder List LeetCode 206. Reverse Linked List LeetCode 234. Palindrome Linked List LeetCode 25. Reverse Nodes in k-Group Conclusion If you can implement this problem without pointer bugs under interview pressure, your linked-list manipulation skill is already at a solid intermediate level. The same split/reverse/merge workflow appears repeatedly in production-grade list transformations.\nReferences https://leetcode.com/problems/reorder-list/ https://en.cppreference.com/w/cpp/container/forward_list https://doc.rust-lang.org/std/option/ Meta Info Reading time: 12-15 min Tags: Hot100, linked list, in-place, two pointers SEO keywords: Reorder List, LeetCode 143, split reverse merge, O(1) space Meta description: In-place O(n)/O(1) linked-list reordering with derivation, pitfalls, and multi-language implementations. CTA Try coding this from scratch in 15 minutes without looking at notes. 
Then extend it to: reverse k-group and palindrome linked list to lock in the pointer templates.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def reorder_list(head): if head is None or head.next is None: return slow, fast = head, head while fast.next and fast.next.next: slow = slow.next fast = fast.next.next second = slow.next slow.next = None prev = None cur = second while cur: nxt = cur.next cur.next = prev prev = cur cur = nxt second = prev first = head while second: n1 = first.next n2 = second.next first.next = second second.next = n1 first = n1 if n1 else second second = n2 #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; typedef struct ListNode { int val; struct ListNode* next; } ListNode; void reorderList(ListNode* head) { if (!head || !head-\u0026gt;next) return; ListNode *slow = head, *fast = head; while (fast-\u0026gt;next \u0026amp;\u0026amp; fast-\u0026gt;next-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; } ListNode* second = slow-\u0026gt;next; slow-\u0026gt;next = NULL; ListNode *prev = NULL, *cur = second; while (cur) { ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; } second = prev; ListNode* first = head; while (second) { ListNode* n1 = first-\u0026gt;next; ListNode* n2 = second-\u0026gt;next; first-\u0026gt;next = second; second-\u0026gt;next = n1; first = n1 ? 
n1 : second; second = n2; } } #include \u0026lt;iostream\u0026gt; struct ListNode { int val; ListNode* next; ListNode(int v) : val(v), next(nullptr) {} }; class Solution { public: void reorderList(ListNode* head) { if (!head || !head-\u0026gt;next) return; ListNode *slow = head, *fast = head; while (fast-\u0026gt;next \u0026amp;\u0026amp; fast-\u0026gt;next-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; } ListNode* second = slow-\u0026gt;next; slow-\u0026gt;next = nullptr; ListNode* prev = nullptr; while (second) { ListNode* nxt = second-\u0026gt;next; second-\u0026gt;next = prev; prev = second; second = nxt; } second = prev; ListNode* first = head; while (second) { ListNode* n1 = first-\u0026gt;next; ListNode* n2 = second-\u0026gt;next; first-\u0026gt;next = second; second-\u0026gt;next = n1; first = n1 ? n1 : second; second = n2; } } }; package main type ListNode struct { Val int Next *ListNode } func reorderList(head *ListNode) { if head == nil || head.Next == nil { return } slow, fast := head, head for fast.Next != nil \u0026amp;\u0026amp; fast.Next.Next != nil { slow = slow.Next fast = fast.Next.Next } second := slow.Next slow.Next = nil var prev *ListNode cur := second for cur != nil { nxt := cur.Next cur.Next = prev prev = cur cur = nxt } second = prev first := head for second != nil { n1 := first.Next n2 := second.Next first.Next = second second.Next = n1 if n1 != nil { first = n1 } else { first = second } second = n2 } } use std::collections::VecDeque; #[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { #[inline] fn new(val: i32) -\u0026gt; Self { ListNode { val, next: None } } } // Safe Rust variant: uses O(n) extra deque to avoid unsafe pointer rewiring. 
pub fn reorder_list(head: \u0026amp;mut Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;) { let mut cur = head.take(); let mut dq: VecDeque\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; = VecDeque::new(); while let Some(mut node) = cur { cur = node.next.take(); dq.push_back(node); } let mut reordered: Vec\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; = Vec::with_capacity(dq.len()); let mut pick_front = true; while !dq.is_empty() { if pick_front { reordered.push(dq.pop_front().unwrap()); } else { reordered.push(dq.pop_back().unwrap()); } pick_front = !pick_front; } let mut new_head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; = None; for mut node in reordered.into_iter().rev() { node.next = new_head; new_head = Some(node); } *head = new_head; } function ListNode(val, next = null) { this.val = val; this.next = next; } function reorderList(head) { if (!head || !head.next) return; let slow = head; let fast = head; while (fast.next \u0026amp;\u0026amp; fast.next.next) { slow = slow.next; fast = fast.next.next; } let second = slow.next; slow.next = null; let prev = null; while (second) { const nxt = second.next; second.next = prev; prev = second; second = nxt; } second = prev; let first = head; while (second) { const n1 = first.next; const n2 = second.next; first.next = second; second.next = n1; first = n1 || second; second = n2; } } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/143-reorder-list/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nReorder List is a classic pointer choreography problem: find middle, reverse second half, then merge alternately. 
This guide derives the in-place O(n)/O(1) method from naive ideas and turns it into a reusable Hot100 template.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003ein-place\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Reorder List, split reverse merge, LeetCode 143, O(1) space\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: A full ACERS explanation of Reorder List with correctness intuition, boundary handling, engineering mapping, and runnable code in Python/C/C++/Go/Rust/JS.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want a stable linked-list rewire template\u003c/li\u003e\n\u003cli\u003eDevelopers who can reverse lists but still fail on alternating merge details\u003c/li\u003e\n\u003cli\u003eEngineers preparing for interviews where O(1) extra space is required\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eAt first glance, this looks like simple reordering.\nIn reality, it tests whether you can safely perform \u003cstrong\u003ethree dependent pointer operations\u003c/strong\u003e in one workflow:\u003c/p\u003e","title":"Hot100: Reorder List In-Place Split-Reverse-Merge ACERS Guide"},{"content":" Subtitle / Summary\nReverse Linked List II is not about full-list reversal; it is about reversing a strict middle interval while preserving both outer connections. 
This ACERS guide explains the dummy-node anchor, head-insertion loop, and boundary-safe implementation.\nReading time: 12-15 min Tags: Hot100, linked list, sublist reversal, dummy node SEO keywords: Reverse Linked List II, sublist reversal, dummy node, head insertion, LeetCode 92, Hot100 Meta description: In-place sublist reversal with dummy node + head insertion in O(n)/O(1), with correctness intuition, pitfalls, and runnable multi-language code. Target Readers Hot100 learners who already know 206 and want the interval version Developers who often fail at linked-list boundary handling (left = 1, right = n) Engineers building reusable pointer-rewiring templates Background / Motivation LeetCode 206 reverses the whole list. LeetCode 92 asks for a stricter operation:\nreverse only nodes from position left to right keep prefix and suffix connected correctly This pattern appears in engineering-style chain structures:\nreplaying a partial compensation chain in reverse order local reorder in an event sequence in-place transformation without allocating new nodes The hard part is not algorithmic complexity; it is pointer safety and boundary consistency.\nCore Concepts Dummy node: unifies the left = 1 case with all other cases Anchor predecessor prev: ends at node left - 1 (or dummy) Current tail cur: starts at prev.next and remains tail of reversed block during loop Head insertion: repeatedly detach cur.next and insert it right after prev A - Algorithm (Problem and Algorithm) Problem Restatement Given the head of a singly linked list and two integers left and right (1 \u0026lt;= left \u0026lt;= right \u0026lt;= n), reverse the nodes from position left to right, and return the new head.\nInput / Output Name Type Description head ListNode head of the singly linked list left int left boundary (1-based) right int right boundary (1-based) return ListNode head after sublist reversal Example 1 input: head = 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5, left = 2, right = 4 
output: 1 -\u0026gt; 4 -\u0026gt; 3 -\u0026gt; 2 -\u0026gt; 5 Example 2 input: head = 5, left = 1, right = 1 output: 5 Thought Process: From Naive to In-Place Naive idea: array conversion Convert list to array Reverse array[left-1:right] Rebuild list (or rewrite values) Problems:\nO(n) extra memory may violate node-identity expectations in real systems Key observation You only need to rewire pointers inside the interval.\nAfter locating prev (node before left), run this repeatedly:\nnxt = cur.next detach nxt: cur.next = nxt.next insert nxt after prev: nxt.next = prev.next prev.next = nxt Repeat right - left times.\nC - Concepts (Core Ideas) Method Category In-place linked-list rewiring Sublist transformation Head-insertion loop Invariant (the correctness handle) After each iteration i (0 \u0026lt;= i \u0026lt;= right - left):\nprev still points to the predecessor of the reversing block prev.next is the head of the already reversed prefix of target block cur remains the tail of the reversed part and head of remaining unreversed part At loop end:\nsublist is reversed prefix and suffix are still connected Pointer Trace (left=2, right=4) start: 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5 ^ cur (prev=1) round 1 (move 3 after 1): 1 -\u0026gt; 3 -\u0026gt; 2 -\u0026gt; 4 -\u0026gt; 5 ^ cur round 2 (move 4 after 1): 1 -\u0026gt; 4 -\u0026gt; 3 -\u0026gt; 2 -\u0026gt; 5 ^ cur Practical Guide / Steps Create dummy, set dummy.next = head Move prev forward left - 1 steps Set cur = prev.next Run right - left rounds of head insertion Return dummy.next Runnable Example (Python) from typing import Optional class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def reverse_between(head: Optional[ListNode], left: int, right: int) -\u0026gt; Optional[ListNode]: if not head or left == right: return head dummy = ListNode(0, head) prev = dummy for _ in range(left - 1): prev = prev.next cur = prev.next for _ in range(right - left): nxt = cur.next 
cur.next = nxt.next nxt.next = prev.next prev.next = nxt return dummy.next def build(nums): dummy = ListNode() tail = dummy for x in nums: tail.next = ListNode(x) tail = tail.next return dummy.next def to_list(head): out = [] while head: out.append(head.val) head = head.next return out if __name__ == \u0026#34;__main__\u0026#34;: h = build([1, 2, 3, 4, 5]) h = reverse_between(h, 2, 4) print(to_list(h)) # [1, 4, 3, 2, 5] Explanation (Why This Works) Treat prev as a fixed anchor before the target interval. Each loop extracts one node right after cur and places it right after prev. That operation grows reversed prefix at the front while keeping the rest linked.\nBenefits:\nNo segment split + re-join ceremony O(1) extra memory Uniform behavior on left = 1 due to dummy node E - Engineering (Real-world Scenarios) Scenario 1: Partial compensation-chain replay (Go) Background: reverse execution order for one segment of compensation tasks.\nWhy it fits: local reorder, identity-preserving nodes, constant memory.\npackage main type Node struct { Val int Next *Node } func reverseBetween(head *Node, left, right int) *Node { if head == nil || left == right { return head } dummy := \u0026amp;Node{Next: head} prev := dummy for i := 0; i \u0026lt; left-1; i++ { prev = prev.Next } cur := prev.Next for i := 0; i \u0026lt; right-left; i++ { nxt := cur.Next cur.Next = nxt.Next nxt.Next = prev.Next prev.Next = nxt } return dummy.Next } Scenario 2: Event-chain local rollback window (Python) Background: only a middle event window needs reverse replay.\nWhy it fits: precise interval operation without global rebuild.\n# Reuse reverse_between(head, left, right) from above. 
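Scenario 2 only references the earlier `reverse_between`. As a standalone sanity check of the rollback-window idea, here is a minimal self-contained sketch (the event ids are illustrative, not from any real system); it redefines the article's helpers so it runs on its own:

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def reverse_between(head, left, right):
    # Dummy anchor + head insertion, exactly as derived above.
    if not head or left == right:
        return head
    dummy = ListNode(0, head)
    prev = dummy
    for _ in range(left - 1):
        prev = prev.next
    cur = prev.next
    for _ in range(right - left):
        nxt = cur.next
        cur.next = nxt.next
        nxt.next = prev.next
        prev.next = nxt
    return dummy.next

def build(vals):
    dummy = ListNode()
    tail = dummy
    for v in vals:
        tail.next = ListNode(v)
        tail = tail.next
    return dummy.next

def to_list(head):
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out

# Replay only events 2..4 of a 5-event chain in reverse order;
# the prefix (101) and suffix (105) stay connected.
events = build([101, 102, 103, 104, 105])
events = reverse_between(events, 2, 4)
print(to_list(events))  # [101, 104, 103, 102, 105]
```

Node identity is preserved throughout: no event node is copied, only `next` pointers inside the window are rewired.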
Scenario 3: Frontend node-flow local reorder (JavaScript) Background: workflow editor supports \u0026ldquo;reverse selected range\u0026rdquo;.\nWhy it fits: fast in-memory adjustment, predictable pointer operations.\nfunction reverseBetween(head, left, right) { if (!head || left === right) return head; const dummy = { val: 0, next: head }; let prev = dummy; for (let i = 0; i \u0026lt; left - 1; i += 1) prev = prev.next; const cur = prev.next; for (let i = 0; i \u0026lt; right - left; i += 1) { const nxt = cur.next; cur.next = nxt.next; nxt.next = prev.next; prev.next = nxt; } return dummy.next; } R - Reflection Complexity Time: O(n) Space: O(1) Work split:\nfind predecessor in left - 1 steps run right - left rewiring rounds Alternatives and Tradeoffs Approach Time Space Notes array conversion O(n) O(n) easy but not in-place cut/reverse/reconnect O(n) O(1) valid, more connection points dummy + head insertion O(n) O(1) concise, boundary-safe, reusable Common Mistakes skipping dummy node and breaking left=1 moving prev wrong number of steps wrong pointer update order causing chain loss comparing values instead of node references in linked-list logic Why this is the practical template single anchor (prev) single loop (right-left rounds) single return (dummy.next) Fewer branches, fewer failure points.\nFAQ and Notes What if left == right?\nReturn head directly.\nDo we need to validate right bounds?\nLeetCode guarantees valid input; production code should still validate.\nCan recursion solve this elegantly?\nYes, but recursion adds stack risk and usually increases boundary complexity.\nHow is this related to 206?\n206 is whole-list reversal; 92 is interval-scoped pointer rewiring built on the same reversal mindset.\nBest Practices Memorize the 4-line head-insertion block Always start from dummy Dry-run with a 5-node list before coding Test 4 boundary cases: left = 1 right = n left = right n = 1 S - Summary LeetCode 92 is an interval rewiring problem, not a value swap 
problem Dummy node removes head-special branching Head insertion gives O(1) extra-space reversal Invariants are the fastest way to reason about correctness This template transfers directly to advanced list reorder problems Recommended Follow-up LeetCode 206 — Reverse Linked List LeetCode 25 — Reverse Nodes in k-Group LeetCode 24 — Swap Nodes in Pairs LeetCode 143 — Reorder List Conclusion Once dummy + predecessor + head insertion becomes muscle memory, sublist reversal stops being pointer chaos and becomes predictable engineering work.\nReferences https://leetcode.com/problems/reverse-linked-list-ii/ https://en.cppreference.com/w/cpp/container/forward_list https://doc.rust-lang.org/book/ch15-01-box.html https://go.dev/doc/effective_go Meta Info Reading time: 12-15 min Tags: Hot100, linked list, sublist reversal, dummy node SEO keywords: Reverse Linked List II, sublist reversal, head insertion, LeetCode 92 Meta description: O(n)/O(1) in-place sublist reversal with dummy node and head insertion. 
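The four boundary cases listed under Best Practices (left = 1, right = n, left = right, n = 1) can be checked mechanically. This is a self-contained sketch that redefines the article's `reverse_between` and helpers so it runs standalone:

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def reverse_between(head, left, right):
    if not head or left == right:
        return head
    dummy = ListNode(0, head)
    prev = dummy
    for _ in range(left - 1):
        prev = prev.next
    cur = prev.next
    for _ in range(right - left):
        nxt = cur.next
        cur.next = nxt.next
        nxt.next = prev.next
        prev.next = nxt
    return dummy.next

def build(vals):
    dummy = ListNode()
    tail = dummy
    for v in vals:
        tail.next = ListNode(v)
        tail = tail.next
    return dummy.next

def to_list(head):
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out

# left = 1: the dummy node makes this case uniform with the rest
assert to_list(reverse_between(build([1, 2, 3]), 1, 2)) == [2, 1, 3]
# right = n: suffix is empty, merge still terminates cleanly
assert to_list(reverse_between(build([1, 2, 3]), 2, 3)) == [1, 3, 2]
# left == right: a no-op, early return
assert to_list(reverse_between(build([1, 2, 3]), 2, 2)) == [1, 2, 3]
# n = 1: single node, nothing to rewire
assert to_list(reverse_between(build([7]), 1, 1)) == [7]
print("all four boundary cases pass")
```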
Call To Action (CTA) Do two drills now:\nReimplement 92 from memory using the 4-line insertion block Move to 25 (k-group reversal) and compare the control flow Multi-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import Optional class ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def reverse_between(head: Optional[ListNode], left: int, right: int) -\u0026gt; Optional[ListNode]: if not head or left == right: return head dummy = ListNode(0, head) prev = dummy for _ in range(left - 1): prev = prev.next cur = prev.next for _ in range(right - left): nxt = cur.next cur.next = nxt.next nxt.next = prev.next prev.next = nxt return dummy.next #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; typedef struct ListNode { int val; struct ListNode *next; } ListNode; ListNode* reverseBetween(ListNode* head, int left, int right) { if (!head || left == right) return head; ListNode dummy; dummy.val = 0; dummy.next = head; ListNode* prev = \u0026amp;dummy; for (int i = 0; i \u0026lt; left - 1; ++i) prev = prev-\u0026gt;next; ListNode* cur = prev-\u0026gt;next; for (int i = 0; i \u0026lt; right - left; ++i) { ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = nxt-\u0026gt;next; nxt-\u0026gt;next = prev-\u0026gt;next; prev-\u0026gt;next = nxt; } return dummy.next; } #include \u0026lt;iostream\u0026gt; struct ListNode { int val; ListNode* next; ListNode(int x) : val(x), next(nullptr) {} }; ListNode* reverseBetween(ListNode* head, int left, int right) { if (!head || left == right) return head; ListNode dummy(0); dummy.next = head; ListNode* prev = \u0026amp;dummy; for (int i = 0; i \u0026lt; left - 1; ++i) prev = prev-\u0026gt;next; ListNode* cur = prev-\u0026gt;next; for (int i = 0; i \u0026lt; right - left; ++i) { ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = nxt-\u0026gt;next; nxt-\u0026gt;next = prev-\u0026gt;next; prev-\u0026gt;next = nxt; } return dummy.next; } package main type ListNode 
struct { Val int Next *ListNode } func reverseBetween(head *ListNode, left int, right int) *ListNode { if head == nil || left == right { return head } dummy := \u0026amp;ListNode{Next: head} prev := dummy for i := 0; i \u0026lt; left-1; i++ { prev = prev.Next } cur := prev.Next for i := 0; i \u0026lt; right-left; i++ { nxt := cur.Next cur.Next = nxt.Next nxt.Next = prev.Next prev.Next = nxt } return dummy.Next } #[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { #[inline] fn new(val: i32) -\u0026gt; Self { ListNode { next: None, val } } } pub fn reverse_between(head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, left: i32, right: i32) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { if left == right { return head; } let mut vals = Vec::new(); let mut cursor = head.as_ref(); while let Some(node) = cursor { vals.push(node.val); cursor = node.next.as_ref(); } let l = (left - 1) as usize; let r = (right - 1) as usize; vals[l..=r].reverse(); let mut dummy = Box::new(ListNode::new(0)); let mut tail = \u0026amp;mut dummy; for v in vals { tail.next = Some(Box::new(ListNode::new(v))); tail = tail.next.as_mut().unwrap(); } dummy.next } function ListNode(val, next = null) { this.val = val; this.next = next; } function reverseBetween(head, left, right) { if (!head || left === right) return head; const dummy = new ListNode(0, head); let prev = dummy; for (let i = 0; i \u0026lt; left - 1; i += 1) prev = prev.next; const cur = prev.next; for (let i = 0; i \u0026lt; right - left; i += 1) { const nxt = cur.next; cur.next = nxt.next; nxt.next = prev.next; prev.next = nxt; } return dummy.next; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/92-reverse-linked-list-ii/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nReverse Linked List II is not about 
full-list reversal; it is about reversing a strict middle interval while preserving both outer connections. This ACERS guide explains the dummy-node anchor, head-insertion loop, and boundary-safe implementation.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003esublist reversal\u003c/code\u003e, \u003ccode\u003edummy node\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Reverse Linked List II, sublist reversal, dummy node, head insertion, LeetCode 92, Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: In-place sublist reversal with dummy node + head insertion in O(n)/O(1), with correctness intuition, pitfalls, and runnable multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who already know 206 and want the interval version\u003c/li\u003e\n\u003cli\u003eDevelopers who often fail at linked-list boundary handling (\u003ccode\u003eleft = 1\u003c/code\u003e, \u003ccode\u003eright = n\u003c/code\u003e)\u003c/li\u003e\n\u003cli\u003eEngineers building reusable pointer-rewiring templates\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eLeetCode 206 reverses the whole list. LeetCode 92 asks for a stricter operation:\u003c/p\u003e","title":"Hot100: Reverse Linked List II Dummy Node + Head-Insertion ACERS Guide"},{"content":" Subtitle / Summary\nDetecting a cycle in a linked list is a pointer chasing problem, not a value comparison problem. 
This ACERS guide explains why Floyd’s fast/slow pointers must meet if a cycle exists, how to avoid null-pointer bugs, and how the same pattern maps to engineering checks.\nReading time: 10-12 min Tags: Hot100, linked list, fast slow pointers, Floyd SEO keywords: Linked List Cycle, Floyd, fast slow pointers, LeetCode 141, Hot100 Meta description: O(n)/O(1) cycle detection in singly linked lists using Floyd fast/slow pointers, with alternatives, common mistakes, and runnable multi-language code. Target Readers Hot100 learners and interview candidates Developers building reusable linked-list two-pointer templates Engineers who need to detect loops in chain-like structures Background / Motivation Cycle bugs are common in pointer-linked structures:\ntraversal never ends (infinite loop) cleanup/free logic hangs system looks \u0026ldquo;randomly stuck\u0026rdquo; while root cause is structural So we need a detection method that is:\nonline (single pass style) memory-light (no large side structure) robust under large lists Floyd fast/slow pointer detection is the standard solution for this profile.\nCore Concepts Cycle: from some node, following next can eventually return to itself pos in problem statement: only used by test data construction, not a function parameter Fast/slow pointers: slow moves 1 step fast moves 2 steps Node identity vs node value: compare pointer/reference identity, not val A - Algorithm (Problem and Algorithm) Problem Restatement Given head node head of a singly linked list, determine whether there is a cycle in the list. 
Return true if a cycle exists, else false.\nInput / Output Name Type Description head ListNode head of singly linked list (can be null) return bool whether cycle exists Example 1 head: 3 -\u0026gt; 2 -\u0026gt; 0 -\u0026gt; -4 ^ | |_____| output: true Example 2 head: 1 -\u0026gt; 2 -\u0026gt; null output: false Thought Process: From Hash Set to Floyd Naive approach: record visited nodes Traverse nodes and store each node reference in a set:\nseen again =\u0026gt; cycle reaches null =\u0026gt; no cycle Pros: easy to reason about. Cons: O(n) extra memory.\nKey observation If a cycle exists, once both pointers enter the cycle:\nfast gains 1 node per step over slow relative gap changes modulo cycle length gap must become 0 eventually So they must meet inside the cycle.\nFinal choice Floyd cycle detection:\ntime O(n) extra space O(1) C - Concepts (Core Ideas) Method Category Two pointers Floyd cycle detection (tortoise-hare) Online structural integrity check Why meeting is guaranteed (intuition) Let cycle length be L. After both pointers are in cycle, each round:\nslow +1 fast +2 Relative movement is +1 modulo L. 
So relative distance cycles through all residues and eventually becomes 0, meaning they meet.\nSafe loop condition Before fast = fast.next.next, we must ensure:\nfast != null fast.next != null Otherwise null dereference happens.\nPractical Guide / Steps Initialize slow = head, fast = head While fast != null \u0026amp;\u0026amp; fast.next != null: slow = slow.next fast = fast.next.next if slow == fast, return true Return false Runnable Python example (linked_list_cycle.py):\nfrom typing import Optional, List class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def has_cycle(head: Optional[ListNode]) -\u0026gt; bool: slow = head fast = head while fast is not None and fast.next is not None: slow = slow.next fast = fast.next.next if slow is fast: return True return False def build(values: List[int], pos: int) -\u0026gt; Optional[ListNode]: if not values: return None nodes = [ListNode(v) for v in values] for i in range(len(nodes) - 1): nodes[i].next = nodes[i + 1] if pos != -1: nodes[-1].next = nodes[pos] return nodes[0] if __name__ == \u0026#34;__main__\u0026#34;: print(has_cycle(build([3, 2, 0, -4], 1))) # True print(has_cycle(build([1, 2], -1))) # False Explanation / Why This Works The algorithm has two outcomes:\nNo cycle: fast pointer reaches null first =\u0026gt; return false Has cycle: fast and slow eventually meet in cycle =\u0026gt; return true No extra memory is needed because we do not store history. 
We rely on relative speed and finite cycle length.\nE - Engineering (Real-world Scenarios) Scenario 1: free-list integrity check in memory pools (C) Background: low-level allocators often keep free blocks as a singly linked list.\nWhy it fits: if free list becomes cyclic, allocation/release may hang; Floyd check is O(1) memory.\nint hasCycle(struct Node* head) { struct Node* slow = head; struct Node* fast = head; while (fast \u0026amp;\u0026amp; fast-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; if (slow == fast) return 1; } return 0; } Scenario 2: backend workflow next-pointer validation (Go) Background: lightweight workflow nodes may be chained by Next.\nWhy it fits: config bugs can create loops and stall execution.\nfunc hasCycle(head *Node) bool { slow, fast := head, head for fast != nil \u0026amp;\u0026amp; fast.Next != nil { slow = slow.Next fast = fast.Next.Next if slow == fast { return true } } return false } Scenario 3: linked object debug check in browser tools (JavaScript) Background: front-end tooling may use chain objects with next links.\nWhy it fits: quickly detect accidental circular links before traversing.\nfunction hasCycle(head) { let slow = head, fast = head; while (fast \u0026amp;\u0026amp; fast.next) { slow = slow.next; fast = fast.next.next; if (slow === fast) return true; } return false; } R - Reflection Complexity Time: O(n) Space: O(1) (Floyd), versus O(n) for hash-set approach Alternatives and Tradeoffs Method Time Space Tradeoff Hash set visited nodes O(n) O(n) easy but memory-heavy Marker on node O(n) O(1) mutates structure; usually forbidden Floyd fast/slow O(n) O(1) best practical baseline Common Mistakes Compare values instead of node identity Forget null checks before double-step fast move Check slow == fast before first movement (trivial true at start) Assume pos is passed to function Why this is engineering-optimal in most cases It balances runtime, memory, and non-intrusiveness:\nlinear 
scan constant memory no mutation FAQ and Notes Can this also find cycle entry?\nYes, that is LeetCode 142 (extra phase after first meeting).\nWhat if list is very large?\nStill linear and memory-safe versus visited set growth.\nWhen is hash set preferable?\nIf you also need to record traversal path or list all repeated nodes.\nBest Practices Always use while fast \u0026amp;\u0026amp; fast.next Compare identity (is, ===, pointer equality), not value Keep this template as your default cycle check S - Summary Cycle detection is about pointer identity and relative speed, not values. Floyd detects cycles in O(n) time with O(1) extra space. Null-check ordering is critical for safety. This template maps directly to chain integrity checks in real systems. Recommended Further Reading LeetCode 141. Linked List Cycle LeetCode 142. Linked List Cycle II Floyd’s cycle detection variants and proofs Conclusion LeetCode 141 is a foundational two-pointer template. Once internalized, it becomes a reusable structural safety check across many chain-based systems.\nReferences https://leetcode.com/problems/linked-list-cycle/ https://leetcode.com/problems/linked-list-cycle-ii/ https://en.wikipedia.org/wiki/Cycle_detection https://en.cppreference.com/w/cpp/container/forward_list Meta Info Reading time: 10-12 min Tags: Hot100, linked list, Floyd, fast slow pointers, LeetCode 141 SEO keywords: Linked List Cycle, Floyd, fast slow pointers, O(1), LeetCode 141 Meta description: Floyd fast/slow pointers detect linked-list cycle in O(n)/O(1), with proof intuition and multi-language implementations. 
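The "relative movement is +1 modulo L" argument from the Concepts section can also be sanity-checked numerically. The following brute-force simulation (not part of the original article; positions are abstract indices in a cycle of length L) confirms the two pointers always meet once both are inside the cycle:

```python
def meet_steps(L, offset):
    """Simulate slow (+1) and fast (+2) inside a cycle of length L,
    with fast starting `offset` nodes ahead. Returns steps until the
    positions coincide, or -1 if the safety bound trips (never should)."""
    slow, fast = 0, offset % L
    steps = 0
    # `steps == 0` forces at least one move when they start at the same node.
    while slow != fast or steps == 0:
        slow = (slow + 1) % L
        fast = (fast + 2) % L
        steps += 1
        if steps > 10 * L:  # safety bound for the simulation
            return -1
    return steps

# The gap (fast - slow) mod L grows by 1 per step, so the meeting
# point is reached after (L - offset) mod L steps (or L if offset is 0).
for L in range(1, 13):
    for offset in range(L):
        s = meet_steps(L, offset)
        expected = (L - offset) % L or L
        assert s == expected, (L, offset, s, expected)
print("meeting guaranteed for every tested cycle length and starting gap")
```

This mirrors the proof intuition: the relative distance walks through all residues modulo L and must hit 0.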
Call To Action (CTA) After this article, solve 142 immediately:\ndetect cycle (this problem) find cycle entry (next step) Treat them as one template family.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import Optional, List class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def hasCycle(head: Optional[ListNode]) -\u0026gt; bool: slow = head fast = head while fast is not None and fast.next is not None: slow = slow.next fast = fast.next.next if slow is fast: return True return False struct ListNode { int val; struct ListNode* next; }; int hasCycle(struct ListNode* head) { struct ListNode* slow = head; struct ListNode* fast = head; while (fast \u0026amp;\u0026amp; fast-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; if (slow == fast) return 1; } return 0; } struct ListNode { int val; ListNode* next; ListNode(int x) : val(x), next(nullptr) {} }; bool hasCycle(ListNode* head) { ListNode* slow = head; ListNode* fast = head; while (fast \u0026amp;\u0026amp; fast-\u0026gt;next) { slow = slow-\u0026gt;next; fast = fast-\u0026gt;next-\u0026gt;next; if (slow == fast) return true; } return false; } package main type ListNode struct { Val int Next *ListNode } func hasCycle(head *ListNode) bool { slow, fast := head, head for fast != nil \u0026amp;\u0026amp; fast.Next != nil { slow = slow.Next fast = fast.Next.Next if slow == fast { return true } } return false } use std::cell::RefCell; use std::rc::Rc; #[derive(Debug)] struct Node { val: i32, next: Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt;, } fn next(node: \u0026amp;Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt;) -\u0026gt; Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt; { node.as_ref().and_then(|rc| rc.borrow().next.clone()) } fn has_cycle(head: 
Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;Node\u0026gt;\u0026gt;\u0026gt;) -\u0026gt; bool { let mut slow = head.clone(); let mut fast = head; loop { slow = next(\u0026amp;slow); fast = next(\u0026amp;fast); if fast.is_none() || slow.is_none() { return false; } fast = next(\u0026amp;fast); if fast.is_none() { return false; } if let (Some(ref s), Some(ref f)) = (\u0026amp;slow, \u0026amp;fast) { if Rc::ptr_eq(s, f) { return true; } } else { return false; } } } function hasCycle(head) { let slow = head, fast = head; while (fast \u0026amp;\u0026amp; fast.next) { slow = slow.next; fast = fast.next.next; if (slow === fast) return true; } return false; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/141-linked-list-cycle/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nDetecting a cycle in a linked list is a pointer chasing problem, not a value comparison problem. This ACERS guide explains why Floyd’s fast/slow pointers must meet if a cycle exists, how to avoid null-pointer bugs, and how the same pattern maps to engineering checks.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003efast slow pointers\u003c/code\u003e, \u003ccode\u003eFloyd\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Linked List Cycle, Floyd, fast slow pointers, LeetCode 141, Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: O(n)/O(1) cycle detection in singly linked lists using Floyd fast/slow pointers, with alternatives, common mistakes, and runnable multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 
id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners and interview candidates\u003c/li\u003e\n\u003cli\u003eDevelopers building reusable linked-list two-pointer templates\u003c/li\u003e\n\u003cli\u003eEngineers who need to detect loops in chain-like structures\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eCycle bugs are common in pointer-linked structures:\u003c/p\u003e","title":"Hot100: Linked List Cycle Floyd Fast/Slow Pointer ACERS Guide"},{"content":" Subtitle / Summary\nThe core of palindrome validation is symmetric comparison, but a singly linked list cannot move backward. The most stable engineering template is: find middle -\u0026gt; reverse second half in-place -\u0026gt; compare -\u0026gt; reverse back to restore.\nReading time: 10-14 min Tags: Hot100, linked list, fast slow pointers, in-place reverse SEO keywords: Palindrome Linked List, fast slow pointers, reverse second half, O(1) space, LeetCode 234 Meta description: O(n)/O(1) palindrome check for singly linked list with middle detection, second-half reversal, comparison, and full structure restoration. Target Readers Hot100 learners who want to master the \u0026ldquo;middle + reverse\u0026rdquo; linked-list combo Developers who frequently solve palindrome/symmetry interview questions Engineers who care about low extra memory and non-destructive checks Background / Motivation For arrays, palindrome check is easy with two pointers from both ends. 
For singly linked lists, you can only move forward via next, so symmetric comparison is not direct.\nReal engineering constraints are often similar to this problem:\navoid O(n) extra containers if possible do not permanently mutate the structure keep linear time So we need a template that is:\nO(n) time O(1) extra space restorable (no side effects after checking) Core Concepts Concept Meaning Purpose Palindrome same sequence forward and backward needs symmetric comparison Fast/slow pointers fast moves 2, slow moves 1 find middle in O(n) In-place reverse reverse pointer direction on second half make backward side comparable forward Restore step reverse second half again and reconnect preserve original structure A - Algorithm (Problem and Algorithm) Problem Restatement Given the head of a singly linked list head, return true if it is a palindrome; otherwise return false.\nInput / Output Name Type Description head ListNode head of singly linked list return bool whether list is palindrome Example 1 input: 1 -\u0026gt; 2 -\u0026gt; 2 -\u0026gt; 1 output: true Example 2 input: 1 -\u0026gt; 2 output: false Thought Process: From Array Copy to In-Place Reversal Naive approach: copy to array Traverse list and copy values to an array Use two pointers on array to check palindrome Pros: simple and robust. 
Cons: needs O(n) extra memory.\nBetter observation If we can reverse only the second half of the list, then both halves become forward-comparable.\nExample:\n1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 2 -\u0026gt; 1 Reverse second half around middle:\nleft side forward: 1 -\u0026gt; 2 -\u0026gt; 3 right side forward: 1 -\u0026gt; 2 Now compare node by node.\nFinal method Find end of first half (fast/slow) Reverse second half Compare first half and reversed second half Reverse back and reconnect (restore) C - Concepts (Core Ideas) Method Category Fast/slow pointer middle finding In-place linked-list reversal Temporary mutation + restoration Stable handling for odd/even lengths A robust implementation uses end_of_first_half:\nodd length: first-half end is exact middle (middle value can be skipped in comparison) even length: first-half end is left-middle Then reverse first_half_end.next, and compare only while second-half pointer is not null. This removes many odd/even branch bugs.\nKey invariant After reversing second half:\np1 starts from head p2 starts from reversed_second_half_head For palindrome lists, p1.val == p2.val for all nodes in second half.\nAfter check:\nreverse second half again reconnect via first_half_end.next So external observers see the original structure.\nPractical Guide / Steps Return true for empty list or single node Find first_half_end by fast/slow pointers Reverse first_half_end.next to get second_half_start Compare head and second_half_start node values Restore: first_half_end.next = reverse(second_half_start) Return comparison result Runnable Python example (palindrome_list.py):\nfrom __future__ import annotations class ListNode: def __init__(self, val: int): self.val = val self.next: ListNode | None = None def reverse_list(head: ListNode | None) -\u0026gt; ListNode | None: prev = None cur = head while cur: nxt = cur.next cur.next = prev prev = cur cur = nxt return prev def end_of_first_half(head: ListNode) -\u0026gt; ListNode: fast = head 
slow = head while fast.next and fast.next.next: fast = fast.next.next slow = slow.next # type: ignore[assignment] return slow def is_palindrome(head: ListNode | None) -\u0026gt; bool: if head is None or head.next is None: return True first_half_end = end_of_first_half(head) second_half_start = reverse_list(first_half_end.next) p1 = head p2 = second_half_start ok = True while ok and p2 is not None: if p1.val != p2.val: ok = False p1 = p1.next # type: ignore[assignment] p2 = p2.next first_half_end.next = reverse_list(second_half_start) # restore return ok Explanation / Why This Works A singly linked list cannot directly read from tail to head. Reversing the second half transforms the \u0026ldquo;backward side\u0026rdquo; into a forward list. So palindrome checking becomes simple forward pair comparison.\nThe important engineering detail is restoration:\ntemporary mutation is acceptable permanent mutation is usually not By reversing the second half again, we restore exact original topology.\nE - Engineering (Real-world Scenarios) Scenario 1: symmetric event-chain validation (Python) Background: a rule engine stores a session as a linked event chain and needs to detect mirrored behavior patterns.\nWhy it fits: O(1) extra memory check without allocating a full copy.\ndef is_symmetric_chain(head): return is_palindrome(head) Scenario 2: embedded frame-sequence symmetry check (C) Background: in memory-limited systems, sampled frames may be chained via next pointers.\nWhy it fits: avoids O(n) buffer allocation and preserves structure after check.\nstruct ListNode { int val; struct ListNode* next; }; static struct ListNode* reverse(struct ListNode* head) { struct ListNode* prev = 0; struct ListNode* cur = head; while (cur) { struct ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; } return prev; } Scenario 3: browser operation-history mirror detection (JavaScript) Background: editor actions are represented as a linked list in a demo 
tool.\nWhy it fits: the pointer-based structure allows direct reuse of the middle+reverse template.\nfunction reverse(head) { let prev = null; let cur = head; while (cur) { const nxt = cur.next; cur.next = prev; prev = cur; cur = nxt; } return prev; } R - Reflection Complexity Time: O(n) (middle find + reverse + compare + restore are all linear) Space: O(1) Alternatives and Tradeoffs Method Time Extra Space Tradeoff Copy to array O(n) O(n) easy but memory-heavy Stack half the values O(n) O(n) copies less than a full array, still linear extra space Recursion compare O(n) O(n) stack risk on long lists Reverse second half (current) O(n) O(1) most practical, but needs careful restore Common Mistakes Forgetting the restore step, leaving the list mutated Mishandling the middle for odd/even lengths Wrong comparison range (comparing beyond the second half) Assuming acyclic input in non-LeetCode environments (detect cycles first) Why this method is the practical optimum It achieves linear time with constant extra memory, and can preserve the original list via restoration. This is usually the best tradeoff under production-style constraints.\nFAQ and Notes Why compare only while p2 is not null?\nThe second half is never longer than the first half; this covers all mirrored pairs.\nWhat if the list has a cycle?\nLeetCode input has no cycle. In production, detect cycles first (Floyd) before this template.\nWill reversal damage the original structure?\nTemporarily yes; final reverse+reconnect restores it.\nCan I skip the restore in an interview?\nDepends on interviewer constraints. In engineering code, restoration is strongly recommended.\nBest Practices Use one stable template: first_half_end + reverse(first_half_end.next) Keep the restore step mandatory unless explicitly allowed to mutate Test both odd and even lengths, plus edge cases ([], [x], non-palindrome near center) S - Summary Singly linked lists cannot traverse backward directly for symmetry checks. Fast/slow pointers locate the split point in O(n). 
Reversing second half converts backward comparison into forward comparison. Restore step preserves original structure and avoids side effects. This template transfers to many list problems using \u0026ldquo;middle + half processing\u0026rdquo;. Recommended Further Reading LeetCode 234. Palindrome Linked List LeetCode 206. Reverse Linked List LeetCode 143. Reorder List LeetCode 876. Middle of the Linked List Conclusion The value of this problem is not only palindrome checking. It is a reusable engineering pattern: locate middle, temporarily transform half, compare, restore.\nReferences https://leetcode.com/problems/palindrome-linked-list/ https://leetcode.com/problems/reverse-linked-list/ https://leetcode.com/problems/middle-of-the-linked-list/ https://en.cppreference.com/w/cpp/container/forward_list Meta Info Reading time: 10-14 min Tags: Hot100, linked list, palindrome, fast slow pointers, in-place reverse SEO keywords: Palindrome Linked List, reverse second half, O(1) space, LeetCode 234, Hot100 Meta description: O(n)/O(1) palindrome check by fast/slow split, reverse second half, compare, and restore. 
Call To Action (CTA) After this one, solve these in order with the same skill set:\n206 Reverse Linked List 143 Reorder List 92 Reverse Linked List II Treat them as one template family, not isolated questions.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) from __future__ import annotations class ListNode: def __init__(self, val: int): self.val = val self.next: ListNode | None = None def reverse_list(head: ListNode | None) -\u0026gt; ListNode | None: prev = None cur = head while cur: nxt = cur.next cur.next = prev prev = cur cur = nxt return prev def end_of_first_half(head: ListNode) -\u0026gt; ListNode: fast, slow = head, head while fast.next and fast.next.next: fast = fast.next.next slow = slow.next # type: ignore[assignment] return slow def is_palindrome(head: ListNode | None) -\u0026gt; bool: if head is None or head.next is None: return True first_half_end = end_of_first_half(head) second_half_start = reverse_list(first_half_end.next) p1, p2 = head, second_half_start ok = True while ok and p2: if p1.val != p2.val: ok = False p1 = p1.next # type: ignore[assignment] p2 = p2.next first_half_end.next = reverse_list(second_half_start) return ok struct ListNode { int val; struct ListNode *next; }; static struct ListNode* reverse(struct ListNode* head) { struct ListNode* prev = 0; struct ListNode* cur = head; while (cur) { struct ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; } return prev; } static struct ListNode* endFirstHalf(struct ListNode* head) { struct ListNode* fast = head; struct ListNode* slow = head; while (fast-\u0026gt;next \u0026amp;\u0026amp; fast-\u0026gt;next-\u0026gt;next) { fast = fast-\u0026gt;next-\u0026gt;next; slow = slow-\u0026gt;next; } return slow; } int isPalindrome(struct ListNode* head) { if (!head || !head-\u0026gt;next) return 1; struct ListNode* firstEnd = endFirstHalf(head); struct ListNode* second = reverse(firstEnd-\u0026gt;next); int ok = 1; struct ListNode *p1 = head, *p2 = 
second; while (ok \u0026amp;\u0026amp; p2) { if (p1-\u0026gt;val != p2-\u0026gt;val) ok = 0; p1 = p1-\u0026gt;next; p2 = p2-\u0026gt;next; } firstEnd-\u0026gt;next = reverse(second); return ok; } struct ListNode { int val; ListNode *next; ListNode(int x) : val(x), next(nullptr) {} }; static ListNode* reverse(ListNode* head) { ListNode* prev = nullptr; ListNode* cur = head; while (cur) { ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; } return prev; } static ListNode* endFirstHalf(ListNode* head) { ListNode* fast = head; ListNode* slow = head; while (fast-\u0026gt;next \u0026amp;\u0026amp; fast-\u0026gt;next-\u0026gt;next) { fast = fast-\u0026gt;next-\u0026gt;next; slow = slow-\u0026gt;next; } return slow; } bool isPalindrome(ListNode* head) { if (!head || !head-\u0026gt;next) return true; ListNode* firstEnd = endFirstHalf(head); ListNode* second = reverse(firstEnd-\u0026gt;next); bool ok = true; ListNode *p1 = head, *p2 = second; while (ok \u0026amp;\u0026amp; p2) { if (p1-\u0026gt;val != p2-\u0026gt;val) ok = false; p1 = p1-\u0026gt;next; p2 = p2-\u0026gt;next; } firstEnd-\u0026gt;next = reverse(second); return ok; } package main type ListNode struct { Val int Next *ListNode } func reverse(head *ListNode) *ListNode { var prev *ListNode cur := head for cur != nil { nxt := cur.Next cur.Next = prev prev = cur cur = nxt } return prev } func endFirstHalf(head *ListNode) *ListNode { fast, slow := head, head for fast.Next != nil \u0026amp;\u0026amp; fast.Next.Next != nil { fast = fast.Next.Next slow = slow.Next } return slow } func isPalindrome(head *ListNode) bool { if head == nil || head.Next == nil { return true } firstEnd := endFirstHalf(head) second := reverse(firstEnd.Next) ok := true p1, p2 := head, second for ok \u0026amp;\u0026amp; p2 != nil { if p1.Val != p2.Val { ok = false } p1 = p1.Next p2 = p2.Next } firstEnd.Next = reverse(second) return ok } #[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub 
next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { #[inline] pub fn new(val: i32) -\u0026gt; Self { ListNode { next: None, val } } } pub fn is_palindrome(head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;) -\u0026gt; bool { let mut vals: Vec\u0026lt;i32\u0026gt; = Vec::new(); let mut cur = head.as_ref(); while let Some(node) = cur { vals.push(node.val); cur = node.next.as_ref(); } let mut i = 0usize; let mut j = vals.len().saturating_sub(1); while i \u0026lt; j { if vals[i] != vals[j] { return false; } i += 1; j = j.saturating_sub(1); } true } function reverse(head) { let prev = null; let cur = head; while (cur) { const nxt = cur.next; cur.next = prev; prev = cur; cur = nxt; } return prev; } function endFirstHalf(head) { let fast = head, slow = head; while (fast.next \u0026amp;\u0026amp; fast.next.next) { fast = fast.next.next; slow = slow.next; } return slow; } function isPalindrome(head) { if (!head || !head.next) return true; const firstEnd = endFirstHalf(head); const second = reverse(firstEnd.next); let ok = true; let p1 = head, p2 = second; while (ok \u0026amp;\u0026amp; p2) { if (p1.val !== p2.val) ok = false; p1 = p1.next; p2 = p2.next; } firstEnd.next = reverse(second); return ok; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/234-palindrome-linked-list/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThe core of palindrome validation is symmetric comparison, but a singly linked list cannot move backward. 
The most stable engineering template is: \u003cstrong\u003efind middle -\u0026gt; reverse second half in-place -\u0026gt; compare -\u0026gt; reverse back to restore\u003c/strong\u003e.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-14 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003efast slow pointers\u003c/code\u003e, \u003ccode\u003ein-place reverse\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Palindrome Linked List, fast slow pointers, reverse second half, O(1) space, LeetCode 234\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: O(n)/O(1) palindrome check for singly linked list with middle detection, second-half reversal, comparison, and full structure restoration.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want to master the \u0026ldquo;middle + reverse\u0026rdquo; linked-list combo\u003c/li\u003e\n\u003cli\u003eDevelopers who frequently solve palindrome/symmetry interview questions\u003c/li\u003e\n\u003cli\u003eEngineers who care about low extra memory and non-destructive checks\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eFor arrays, palindrome check is easy with two pointers from both ends.\nFor singly linked lists, you can only move forward via \u003ccode\u003enext\u003c/code\u003e, so symmetric comparison is not direct.\u003c/p\u003e","title":"Hot100: Palindrome Linked List Fast/Slow + Reverse Second Half O(1) Space ACERS Guide"},{"content":" Subtitle / Summary\nReverse Linked List is the first serious pointer-rewiring exercise in Hot100. 
It looks simple, but most bugs come from broken links and wrong operation order. This ACERS guide explains the three-pointer iterative template thoroughly and compares it with recursion.\nReading time: 10-12 min Tags: Hot100, linked list, pointer, iteration SEO keywords: Hot100, Reverse Linked List, three pointers, iterative, recursive, LeetCode 206 Meta description: Three-pointer iterative reversal in O(n)/O(1), with recursive contrast, common pitfalls, engineering mapping, and runnable multi-language implementations. Target Readers Hot100 learners and interview candidates Developers who often hit null-pointer or broken-chain bugs in list problems Engineers who want stable pointer manipulation patterns in C/C++/Rust/Go Background / Motivation In production code, \u0026ldquo;reverse linked list\u0026rdquo; may not appear as a LeetCode function, but the skill is highly transferable:\nReorder nodes in-place with O(1) extra memory Keep link integrity while changing direction Handle head = null and single-node lists without special-case chaos If this template is truly internalized, many list problems become straightforward: reverse sublist, reverse k-group, palindrome list, and so on.\nCore Concepts Singly linked list: each node has one next pointer Broken-link risk: if you overwrite cur.next before saving old next, you lose the remaining chain Three pointers (prev, cur, next): save successor, reverse link, then advance Loop invariant: prev is always the head of the already reversed part cur is always the head of the not-yet-processed part A - Algorithm (Problem and Algorithm) Problem Restatement Given the head of a singly linked list, reverse the list and return the new head.\nInput / Output Name Type Description head ListNode head of singly linked list (can be null) return ListNode new head after reversal Example 1 input: 1 -\u0026gt; 2 -\u0026gt; 3 -\u0026gt; 4 -\u0026gt; 5 -\u0026gt; null output: 5 -\u0026gt; 4 -\u0026gt; 3 -\u0026gt; 2 -\u0026gt; 1 -\u0026gt; null 
Example 2 input: 1 -\u0026gt; 2 -\u0026gt; null output: 2 -\u0026gt; 1 -\u0026gt; null Thought Process: From Naive to In-Place Naive idea: copy values and rebuild Traverse list, collect values, rebuild in reverse order Works, but uses O(n) extra memory and creates new nodes For interviews and systems code, this is usually not what is asked.\nKey observation You do not need new nodes.\nYou only need to rewire next.\nFor current node cur:\nSave old successor: next = cur.next Reverse pointer: cur.next = prev Move forward: prev = cur, cur = next Method choice Use iterative three-pointer template:\nTime: O(n) Extra space: O(1) Stable and stack-safe C - Concepts (Core Ideas) Method Category In-place linked-list manipulation Iterative simulation Recursion as equivalent reference solution Loop Invariant (why it is correct) At each loop start:\nprev points to a valid reversed list cur points to the first node not yet reversed Original nodes are partitioned into: reversed prefix untouched suffix Each iteration moves exactly one node from untouched suffix to reversed prefix. 
When cur == null, all nodes are in reversed prefix, and prev is the new head.\nRecursive counterpart Recursive idea:\nReverse head.next onward and get new_head Let head.next.next = head Set head.next = null to cut old forward link Readable, but stack usage is O(n).\nPractical Guide / Steps Initialize prev = null, cur = head While cur != null: next = cur.next cur.next = prev prev = cur cur = next Return prev Runnable Python example (reverse_list.py):\nclass ListNode: def __init__(self, val=0, next=None): self.val = val self.next = next def reverse_list(head): prev = None cur = head while cur is not None: nxt = cur.next cur.next = prev prev = cur cur = nxt return prev def from_list(arr): dummy = ListNode() tail = dummy for x in arr: tail.next = ListNode(x) tail = tail.next return dummy.next def to_list(head): ans = [] while head: ans.append(head.val) head = head.next return ans if __name__ == \u0026#34;__main__\u0026#34;: h = from_list([1, 2, 3, 4, 5]) print(to_list(reverse_list(h))) Explanation / Why This Works The order of operations is the whole point:\nSave next first (avoid losing remaining chain) Reverse current edge (cur.next = prev) Advance both pointers If you swap step 1 and 2, the rest of list may become unreachable. 
That is the most common bug in this problem.\nE - Engineering (Real-world Scenarios) Scenario 1: free-list reorder in memory-oriented systems (C) Background: some allocators keep free blocks in a singly linked free-list.\nWhy it fits: reversing list order is an in-place strategy to alter reuse order without allocating memory.\n#include \u0026lt;stdio.h\u0026gt; typedef struct Node { int id; struct Node* next; } Node; Node* reverse(Node* head) { Node* prev = NULL; Node* cur = head; while (cur) { Node* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; } return prev; } int main(void) { Node c = {3, NULL}; Node b = {2, \u0026amp;c}; Node a = {1, \u0026amp;b}; Node* head = reverse(\u0026amp;a); for (Node* p = head; p; p = p-\u0026gt;next) printf(\u0026#34;%d \u0026#34;, p-\u0026gt;id); printf(\u0026#34;\\n\u0026#34;); return 0; } Scenario 2: server-side operation stack replay direction switch (Go) Background: a lightweight task chain is stored as a singly linked stack.\nWhy it fits: reversing in-place switches replay direction without extra containers.\npackage main import \u0026#34;fmt\u0026#34; type Node struct { Val int Next *Node } func reverse(head *Node) *Node { var prev *Node cur := head for cur != nil { nxt := cur.Next cur.Next = prev prev = cur cur = nxt } return prev } func main() { head := \u0026amp;Node{1, \u0026amp;Node{2, \u0026amp;Node{3, nil}}} head = reverse(head) for p := head; p != nil; p = p.Next { fmt.Print(p.Val, \u0026#34; \u0026#34;) } fmt.Println() } Scenario 3: pointer-animation teaching in browser (JavaScript) Background: in algorithm visualization, object references simulate list nodes.\nWhy it fits: the three-pointer state transition is easy to animate frame-by-frame.\nfunction Node(val, next = null) { this.val = val; this.next = next; } function reverse(head) { let prev = null; let cur = head; while (cur) { const nxt = cur.next; cur.next = prev; prev = cur; cur = nxt; } return prev; } let head = new Node(1, new 
Node(2, new Node(3))); head = reverse(head); const out = []; for (let p = head; p; p = p.next) out.push(p.val); console.log(out.join(\u0026#34; \u0026#34;)); R - Reflection Complexity Iterative: Time: O(n) Space: O(1) Recursive: Time: O(n) Space: O(n) (call stack) Alternatives and Tradeoffs Method Time Extra Space Tradeoff Rebuild new list O(n) O(n) Easy but not in-place Recursive reversal O(n) O(n) Elegant, but stack-risk on long lists Three-pointer iterative O(n) O(1) Best engineering baseline Common Mistakes Rewiring before saving next (breaks the chain) Forgetting to advance cur (infinite loop) In the recursive version, forgetting head.next = null (cycle risk) Reversing values instead of links (misses the structural requirement) Why iterative is most practical Stack-safe for very long lists Local, inspectable pointer transitions Better fit for systems programming and production safety constraints FAQ and Notes What about an empty list or a single node?\nThe same loop handles both naturally.\nAre three pointers mandatory?\nYou must preserve the successor somehow; variable names can differ, but the state is equivalent.\nWhy not prefer recursion since it\u0026rsquo;s shorter?\nRecursion depth can overflow the stack on long input; iterative is safer as a default.\nHow to self-check quickly?\nUse this rule: save next first, reverse the link, then advance.\nBest Practices Memorize the operation order as a fixed template Draw pointer state for at least 2-3 iterations before coding Use iterative as default in production-quality code S - Summary Reverse Linked List is fundamentally pointer rewiring, not value swapping. The three-pointer iterative template achieves O(n) time and O(1) extra space. Correct order is the core: save successor -\u0026gt; reverse link -\u0026gt; advance. The recursive form is useful for understanding, but iterative is usually safer for engineering. Recommended Further Reading LeetCode 206. Reverse Linked List LeetCode 92. Reverse Linked List II LeetCode 25. Reverse Nodes in k-Group LeetCode 234. 
Palindrome Linked List Conclusion Once the three-pointer template is stable in your muscle memory, most linked-list reversal variants become local modifications rather than new problems.\nReferences https://leetcode.com/problems/reverse-linked-list/ https://en.cppreference.com/w/cpp/container/forward_list https://doc.rust-lang.org/std/option/enum.Option.html https://go.dev/tour/moretypes/6 Meta Info Reading time: 10-12 min Tags: Hot100, linked list, pointer, iteration, LeetCode 206 SEO keywords: Reverse Linked List, three pointers, iterative, recursive, LeetCode 206, Hot100 Meta description: O(n)/O(1) linked-list reversal with three pointers, with recursive comparison and runnable multi-language implementations. Call To Action (CTA) Do two drills to lock this in:\nManually trace prev/cur/next for at least three steps without code Solve LeetCode 92 right after this one to reuse the same rewiring pattern Multi-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import Optional class ListNode: def __init__(self, val: int = 0, next: Optional[\u0026#34;ListNode\u0026#34;] = None): self.val = val self.next = next def reverseList(head: Optional[ListNode]) -\u0026gt; Optional[ListNode]: prev = None cur = head while cur is not None: nxt = cur.next cur.next = prev prev = cur cur = nxt return prev #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct ListNode { int val; struct ListNode* next; }; struct ListNode* reverseList(struct ListNode* head) { struct ListNode* prev = NULL; struct ListNode* cur = head; while (cur) { struct ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; } return prev; } #include \u0026lt;iostream\u0026gt; struct ListNode { int val; ListNode* next; ListNode(int x) : val(x), next(nullptr) {} }; ListNode* reverseList(ListNode* head) { ListNode* prev = nullptr; ListNode* cur = head; while (cur) { ListNode* nxt = cur-\u0026gt;next; cur-\u0026gt;next = prev; prev = cur; cur = nxt; 
} return prev; } package main type ListNode struct { Val int Next *ListNode } func reverseList(head *ListNode) *ListNode { var prev *ListNode cur := head for cur != nil { nxt := cur.Next cur.Next = prev prev = cur cur = nxt } return prev } #[derive(PartialEq, Eq, Clone, Debug)] pub struct ListNode { pub val: i32, pub next: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;, } impl ListNode { pub fn new(val: i32) -\u0026gt; Self { ListNode { val, next: None } } } pub fn reverse_list(mut head: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt;) -\u0026gt; Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; { let mut prev: Option\u0026lt;Box\u0026lt;ListNode\u0026gt;\u0026gt; = None; while let Some(mut node) = head { head = node.next.take(); node.next = prev; prev = Some(node); } prev } function ListNode(val, next = null) { this.val = val; this.next = next; } function reverseList(head) { let prev = null; let cur = head; while (cur) { const nxt = cur.next; cur.next = prev; prev = cur; cur = nxt; } return prev; } ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/206-reverse-linked-list/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nReverse Linked List is the first serious pointer-rewiring exercise in Hot100. It looks simple, but most bugs come from broken links and wrong operation order. 
This ACERS guide explains the three-pointer iterative template thoroughly and compares it with recursion.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003epointer\u003c/code\u003e, \u003ccode\u003eiteration\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Hot100, Reverse Linked List, three pointers, iterative, recursive, LeetCode 206\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Three-pointer iterative reversal in O(n)/O(1), with recursive contrast, common pitfalls, engineering mapping, and runnable multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners and interview candidates\u003c/li\u003e\n\u003cli\u003eDevelopers who often hit null-pointer or broken-chain bugs in list problems\u003c/li\u003e\n\u003cli\u003eEngineers who want stable pointer manipulation patterns in C/C++/Rust/Go\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eIn production code, \u0026ldquo;reverse linked list\u0026rdquo; may not appear as a LeetCode function, but the skill is highly transferable:\u003c/p\u003e","title":"Hot100: Reverse Linked List Three-Pointer Iterative/Recursive ACERS Guide"},{"content":" Subtitle / Summary\nThis is Hot100 article #1 for the series: Subarray Sum Equals K. 
We reduce the naive O(n^2) approach to O(n) with prefix sum plus a frequency hash map, then map the same pattern to real engineering scenarios.\nReading time: 12-15 min Tags: Hot100, prefix sum, hash map SEO keywords: Subarray Sum Equals K, prefix sum, hash map, O(n), Hot100 Meta description: O(n) counting of subarrays with sum k using prefix sum + hash map, with complexity analysis and runnable multi-language code. Target Readers Hot100 learners who want stable reusable templates Intermediate engineers who want to transfer counting patterns to real data pipelines Interview prep readers who want to master prefix sum + hash map Background / Motivation \u0026ldquo;Count subarrays whose sum equals k\u0026rdquo; is one of the most classic counting problems. It appears in log analytics, risk threshold hits, and transaction sequence statistics. The two-loop brute force method is straightforward, but slows down quickly as input grows. So we need an O(n) method that scales.\nCore Concepts (Must Understand) Subarray: a continuous non-empty segment in an array Prefix sum: cumulative sum from start to a position Difference relation: if prefix[r] - prefix[l-1] = k, then nums[l..r] sums to k Frequency hash map: count how many times each prefix sum has appeared A - Algorithm Problem Restatement Given an integer array nums and an integer k, return the total number of subarrays whose sum equals k. A subarray must be continuous and non-empty.\nInput / Output Name Type Description nums int[] integer array k int target sum return int number of subarrays with sum k Example 1 nums = [1, 1, 1], k = 2 Valid subarrays are [1,1] at indices (0..1) and (1..2).\nOutput: 2\nExample 2 nums = [1, 2, 3], k = 3 Valid subarrays are [1,2] and [3].\nOutput: 2\nC - Concepts Method Category Prefix sum + frequency hash map, a standard counting pattern.\nKey Formula Define prefix sum as:\nprefix[0] = 0 prefix[i] = nums[0] + nums[1] + ... 
+ nums[i-1] Then subarray sum:\nsum(l..r) = prefix[r+1] - prefix[l] To make it equal k, we need:\nprefix[l] = prefix[r+1] - k Core Idea Scan from left to right with running sum s. At each element:\nCount how many previous prefix sums equal s - k Add that count to answer Insert current s into frequency map This order (\u0026ldquo;count first, then insert\u0026rdquo;) prevents missing valid subarrays.\nPractical Guide / Steps Initialize s = 0, ans = 0, count = {0: 1} For each element x in nums: s += x ans += count.get(s - k, 0) count[s] = count.get(s, 0) + 1 Return ans Runnable Example (Python) from typing import List def subarray_sum(nums: List[int], k: int) -\u0026gt; int: count = {0: 1} ans = 0 s = 0 for x in nums: s += x ans += count.get(s - k, 0) count[s] = count.get(s, 0) + 1 return ans if __name__ == \u0026#34;__main__\u0026#34;: print(subarray_sum([1, 1, 1], 2)) print(subarray_sum([1, 2, 3], 3)) Run:\npython3 demo.py Explanation / Why This Works The hard part is continuity: subarrays must be continuous. Prefix sum turns \u0026ldquo;continuous range sum\u0026rdquo; into \u0026ldquo;difference of two prefix sums\u0026rdquo;. 
So counting subarrays becomes counting how many previous prefix sums match s - k.\nThis is also why sliding window is unreliable here: if negative numbers exist, window monotonicity breaks and common window rules fail.\nE - Engineering Scenario 1: transaction stream threshold hit counting (Python) Background: count how many contiguous day ranges have net amount exactly k.\nWhy it fits: amounts can be positive/negative; sliding window is not stable.\ndef count_exact_k(amounts, k): count = {0: 1} s = 0 ans = 0 for x in amounts: s += x ans += count.get(s - k, 0) count[s] = count.get(s, 0) + 1 return ans print(count_exact_k([3, -1, 2, 1, -2, 4], 3)) Scenario 2: service monitoring replay window counting (Go) Background: count contiguous windows where error count sum equals k during offline replay.\nWhy it fits: large log arrays need O(n) throughput.\npackage main import \u0026#34;fmt\u0026#34; func countExactK(nums []int, k int) int { count := map[int]int{0: 1} sum := 0 ans := 0 for _, x := range nums { sum += x ans += count[sum-k] count[sum]++ } return ans } func main() { fmt.Println(countExactK([]int{1, 2, 3, -2, 2}, 3)) } Scenario 3: front-end cart threshold hint (JavaScript) Background: count contiguous product price blocks that exactly hit promotion threshold k.\nWhy it fits: lightweight in-browser counting without backend round trip.\nfunction countExactK(nums, k) { const count = new Map(); count.set(0, 1); let sum = 0; let ans = 0; for (const x of nums) { sum += x; ans += count.get(sum - k) || 0; count.set(sum, (count.get(sum) || 0) + 1); } return ans; } console.log(countExactK([5, -1, 2, 4, -2], 4)); R - Reflection Complexity Time: O(n) Space: O(n) Alternatives and Tradeoffs Method Time Space Notes Brute force double loop O(n^2) O(1) Simple but slow Prefix sum + hash map O(n) O(n) Current method, practical optimum Sorted prefix / tree structure O(n log n) O(n) Useful in related variants, but heavier Common Wrong Ideas Sliding window: only reliable for 
non-negative constraints Missing count[0] = 1: loses subarrays starting at index 0 32-bit accumulation risk: use 64-bit where overflow is possible Why This Is Optimal You must inspect each element at least once, so lower bound is O(n). Hash map gives amortized O(1) lookup/insert per step, reaching that bound.\nFAQ and Notes What if array contains negatives?\nPrefix-sum counting handles negatives naturally and remains correct.\nCounterexample for sliding window: nums = [1, -1, 1], k = 1.\nCorrect answer is 3 ([1], [1,-1,1], [1]), but positive-window rules miss cases.\nCan large k overflow integer sum?\nUse 64-bit running sum in languages where int may overflow.\nIs subarray the same as subsequence?\nNo. Subarray is continuous. Subsequence is not.\nBest Practices Keep this as a fixed template: prefix sum + frequency map Always initialize count[0] = 1 Prefer 64-bit for running sum on large values Add tests for negatives, all zeros, and k = 0 S - Summary Continuous-range sum counting can be converted to prefix-sum difference counting. Frequency hash map reduces counting from O(n^2) to O(n). Sliding window is not generally correct when negatives are present. count[0] = 1 is a critical correctness detail. This pattern transfers well to logs, transactions, and monitoring streams. Recommended Further Reading LeetCode 560 - Subarray Sum Equals K Prefix Sum data structure patterns Hash-map frequency counting templates Sliding window applicability conditions Conclusion The value of this problem is not a one-off trick. It is a reusable counting model. 
Once internalized, you can solve many \u0026ldquo;continuous range count\u0026rdquo; problems quickly and safely.\nReferences https://leetcode.com/problems/subarray-sum-equals-k/ https://cp-algorithms.com/data_structures/prefix_sum.html https://en.cppreference.com/w/cpp/container/unordered_map https://doc.rust-lang.org/std/collections/struct.HashMap.html Meta Info Reading time: 12-15 min Tags: Hot100, prefix sum, hash map, counting SEO keywords: Subarray Sum Equals K, prefix sum, hash map, O(n) Meta description: Count subarrays with sum k in O(n) using prefix sum + hash map, with engineering mapping and multi-language code. Call To Action (CTA) If you are doing Hot100, do not just memorize answers. Write each problem as \u0026ldquo;pattern + engineering mapping\u0026rdquo; and keep a reusable template set. Share your own variant in comments if you adapt this pattern to a production case.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import List def subarray_sum(nums: List[int], k: int) -\u0026gt; int: count = {0: 1} ans = 0 s = 0 for x in nums: s += x ans += count.get(s - k, 0) count[s] = count.get(s, 0) + 1 return ans if __name__ == \u0026#34;__main__\u0026#34;: print(subarray_sum([1, 1, 1], 2)) #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; typedef struct { long long key; int val; int used; } Entry; static unsigned long long hash_ll(long long x) { return (unsigned long long)x * 11400714819323198485ull; } static int find_slot(Entry *table, int cap, long long key, int *found) { unsigned long long mask = (unsigned long long)cap - 1ull; unsigned long long idx = hash_ll(key) \u0026amp; mask; while (table[idx].used \u0026amp;\u0026amp; table[idx].key != key) { idx = (idx + 1ull) \u0026amp; mask; } *found = table[idx].used \u0026amp;\u0026amp; table[idx].key == key; return (int)idx; } int subarray_sum(const int *nums, int n, int k) { int cap = 1; while (cap \u0026lt; n * 2) cap \u0026lt;\u0026lt;= 1; if (cap 
\u0026lt; 2) cap = 2; Entry *table = (Entry *)calloc((size_t)cap, sizeof(Entry)); if (!table) return 0; long long sum = 0; int ans = 0; int found = 0; int pos = find_slot(table, cap, 0, \u0026amp;found); table[pos].used = 1; table[pos].key = 0; table[pos].val = 1; for (int i = 0; i \u0026lt; n; ++i) { sum += nums[i]; pos = find_slot(table, cap, sum - k, \u0026amp;found); if (found) ans += table[pos].val; pos = find_slot(table, cap, sum, \u0026amp;found); if (found) { table[pos].val += 1; } else { table[pos].used = 1; table[pos].key = sum; table[pos].val = 1; } } free(table); return ans; } int main(void) { int nums[] = {1, 1, 1}; printf(\u0026#34;%d\\n\u0026#34;, subarray_sum(nums, 3, 2)); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;unordered_map\u0026gt; #include \u0026lt;vector\u0026gt; int subarraySum(const std::vector\u0026lt;int\u0026gt; \u0026amp;nums, int k) { std::unordered_map\u0026lt;long long, int\u0026gt; count; count[0] = 1; long long sum = 0; int ans = 0; for (int x : nums) { sum += x; auto it = count.find(sum - k); if (it != count.end()) { ans += it-\u0026gt;second; } count[sum] += 1; } return ans; } int main() { std::vector\u0026lt;int\u0026gt; nums{1, 1, 1}; std::cout \u0026lt;\u0026lt; subarraySum(nums, 2) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func subarraySum(nums []int, k int) int { count := map[int]int{0: 1} sum := 0 ans := 0 for _, x := range nums { sum += x ans += count[sum-k] count[sum]++ } return ans } func main() { fmt.Println(subarraySum([]int{1, 1, 1}, 2)) } use std::collections::HashMap; fn subarray_sum(nums: \u0026amp;[i32], k: i32) -\u0026gt; i32 { let mut count: HashMap\u0026lt;i64, i32\u0026gt; = HashMap::new(); count.insert(0, 1); let mut sum: i64 = 0; let mut ans: i32 = 0; for \u0026amp;x in nums { sum += x as i64; if let Some(v) = count.get(\u0026amp;(sum - k as i64)) { ans += *v; } *count.entry(sum).or_insert(0) += 1; } ans } fn main() { let nums 
= vec![1, 1, 1]; println!(\u0026#34;{}\u0026#34;, subarray_sum(\u0026amp;nums, 2)); } function subarraySum(nums, k) { const count = new Map(); count.set(0, 1); let sum = 0; let ans = 0; for (const x of nums) { sum += x; ans += count.get(sum - k) || 0; count.set(sum, (count.get(sum) || 0) + 1); } return ans; } console.log(subarraySum([1, 1, 1], 2)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/560-subarray-sum-equals-k/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThis is Hot100 article #1 for the series: Subarray Sum Equals K. We reduce the naive O(n^2) approach to O(n) with prefix sum plus a frequency hash map, then map the same pattern to real engineering scenarios.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12-15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003eprefix sum\u003c/code\u003e, \u003ccode\u003ehash map\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Subarray Sum Equals K, prefix sum, hash map, O(n), Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: O(n) counting of subarrays with sum k using prefix sum + hash map, with complexity analysis and runnable multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want stable reusable templates\u003c/li\u003e\n\u003cli\u003eIntermediate engineers who want to transfer counting patterns to real data pipelines\u003c/li\u003e\n\u003cli\u003eInterview prep readers who want to master prefix sum + hash map\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / 
Motivation\u003c/h2\u003e\n\u003cp\u003e\u0026ldquo;Count subarrays whose sum equals k\u0026rdquo; is one of the most classic counting problems.\nIt appears in log analytics, risk threshold hits, and transaction sequence statistics.\nThe two-loop brute force method is straightforward, but slows down quickly as input grows.\nSo we need an O(n) method that scales.\u003c/p\u003e","title":"Hot100: Subarray Sum Equals K Prefix Sum + Hash Map ACERS Guide"},{"content":" This is a \u0026ldquo;graph algorithms topic navigation\u0026rdquo; page. The goal is not to stack articles together, but to give you an executable learning path from basic traversal to distributed graph computation.\nCurrent Directory Status (Topic Structuring Completed) The graph algorithms series has been migrated to:\ncontent/zh/dev/algorithm/graph/ It also uses two-digit prefixes (00/10/20...) to mark reading order, which makes it easier to:\nBrowse in sequence within the file system Insert new articles incrementally later (while preserving numbering gaps) Locate stages quickly during batch maintenance Recommended Reading Order (By Capability Building) Stage 0: Traversal Fundamentals (Lay the Foundation First) BFS / DFS Engineering Intro: k-hop Queries, Subgraph Extraction, and Path Reachability Shortest Path in Practice: Engineering Selection of BFS, Dijkstra, and A* Goals:\nReliably implement iterative graph traversal; Explain when to use BFS and when to use Dijkstra/A*; Build the habit of adding early stop, visited, and budget limits. 
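The Stage 0 habits listed above (early stop, visited set, budget limit) can be made concrete with a short sketch. This is illustrative only: `bfs_reachable` and its `max_nodes` budget parameter are hypothetical names, not taken from the linked articles.

```python
from collections import deque

def bfs_reachable(adj, src, dst, max_nodes=10_000):
    """Iterative BFS with the Stage 0 habits: a visited set,
    early stop on the target, and a node-expansion budget."""
    visited = {src}
    queue = deque([src])
    expanded = 0
    while queue:
        node = queue.popleft()
        if node == dst:
            return True          # early stop: target reached
        expanded += 1
        if expanded > max_nodes:
            return False         # budget exhausted: give up gracefully
        for nxt in adj.get(node, []):
            if nxt not in visited:
                visited.add(nxt)  # mark on enqueue to avoid duplicates
                queue.append(nxt)
    return False

adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(bfs_reachable(adj, 0, 3))  # True
print(bfs_reachable(adj, 3, 0))  # False
```

Marking vertices as visited on enqueue (not on dequeue) is the detail that keeps the queue from growing with duplicates, which matters once budgets are in play.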
Stage 1: Reachability and Connectivity Structure (Core of Graph Querying) k-hop and Reachability Queries: BFS Constraints, Reachability Indexes, and 2-hop Labeling Connected Components and SCC: Tarjan / Kosaraju Goals:\nUpgrade \u0026ldquo;can it reach?\u0026rdquo; from one-off search to a system capability; Understand that undirected connectivity and directed strong connectivity are different problem classes; Build a combined mindset of \u0026ldquo;online BFS + offline index.\u0026rdquo; Stage 2: Graph Analytics Metrics (From Reachability to Insight) Graph Centrality: Degree / Betweenness / Closeness PageRank / Personalized PageRank: Node Importance and Incremental Updates Goals:\nExplain different definitions of \u0026ldquo;importance\u0026rdquo; and their applicability boundaries; Apply centrality and PageRank to recommendation, risk control, and influence analysis; Understand that \u0026ldquo;metric is theoretically correct\u0026rdquo; and \u0026ldquo;platform can run it well\u0026rdquo; are different issues. Stage 3: Structure Mining and Matching (Application-Layer Capabilities) Subgraph Matching: VF2, Ullmann, and Pruning Community Detection: Louvain and Label Propagation Goals:\nPerform pattern recognition and rule-graph matching; Make engineering tradeoffs between \u0026ldquo;community quality vs speed\u0026rdquo;; Understand cost-curve differences between matching and clustering. 
Stage 4: Large-Scale and Dynamic Scenarios (Platform-Level Capabilities) Dynamic Graphs and Incremental Computation: Incremental Shortest Path, Incremental PageRank, and Connectivity Maintenance Graph Partitioning: Edge-cut, Vertex-cut, and METIS Selection Graph Computation Models: Pregel (BSP) and GAS, How to Run PageRank / CC / Parallel BFS Goals:\nDecide when to do full recomputation and when to do incremental updates; Co-design algorithms with partitioning/communication/convergence strategies; Explain the root causes of \u0026ldquo;why this graph workload is slow in distributed environments.\u0026rdquo; Two Practical Study Rhythms Rhythm A (2-Week Sprint, Engineering First) Week 1: Stages 0-1 (articles 1-4) Week 2: Stages 2-4 (articles 5-11) Best for: people who need to connect graph capabilities to business lines quickly.\nRhythm B (4-Week Steady Path, Principles First) Week 1: Traversal and shortest path (1-2) Week 2: Reachability and connectivity (3-4) Week 3: Centrality and PageRank (5-6) Week 4: Matching/community/dynamic graph/partitioning/computation models (7-11) Best for: people building graph platforms or maintaining graph services long term.\nRecommendations for Using This Series After each article, run at least one runnable code sample from the post. Bring your own business graph into the same problem frame (input scale, update frequency, SLA). For every task, write down \u0026ldquo;stop condition + budget + regression baseline\u0026rdquo;; this matters more than memorizing one more formula. Next Steps (Optional) If you continue expanding this series, evolve it in this order:\nFirst, apply a unified tag across all 11 posts (for example, graph-algorithms-series) Then add second-level aggregation pages (fundamentals/analytics/platform) For new posts, prioritize 120/130... 
numbering to avoid renumbering older files ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/00-graph-algorithms-learning-path/","summary":"\u003cblockquote\u003e\n\u003cp\u003eThis is a \u0026ldquo;graph algorithms topic navigation\u0026rdquo; page. The goal is not to stack articles together, but to give you an executable learning path from basic traversal to distributed graph computation.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003ch2 id=\"current-directory-status-topic-structuring-completed\"\u003eCurrent Directory Status (Topic Structuring Completed)\u003c/h2\u003e\n\u003cp\u003eThe graph algorithms series has been migrated to:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ccode\u003econtent/zh/dev/algorithm/graph/\u003c/code\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eIt also uses two-digit prefixes (\u003ccode\u003e00/10/20...\u003c/code\u003e) to mark reading order, which makes it easier to:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eBrowse in sequence within the file system\u003c/li\u003e\n\u003cli\u003eInsert new articles incrementally later (while preserving numbering gaps)\u003c/li\u003e\n\u003cli\u003eLocate stages quickly during batch maintenance\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"recommended-reading-order-by-capability-building\"\u003eRecommended Reading Order (By Capability Building)\u003c/h2\u003e\n\u003ch3 id=\"stage-0-traversal-fundamentals-lay-the-foundation-first\"\u003eStage 0: Traversal Fundamentals (Lay the Foundation First)\u003c/h3\u003e\n\u003col\u003e\n\u003cli\u003e\u003ca href=\"/jeanblog/dev/algorithm/graph/10-bfs-dfs-k-hop-subgraph-path-existence/\"\u003eBFS / DFS Engineering Intro: k-hop Queries, Subgraph Extraction, and Path Reachability\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"/jeanblog/dev/algorithm/graph/20-shortest-path-bfs-dijkstra-astar-acers/\"\u003eShortest Path in Practice: Engineering Selection of BFS, Dijkstra, and 
A*\u003c/a\u003e\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eGoals:\u003c/p\u003e","title":"Graph Algorithms Learning Path: From BFS to Graph Computation Models"},{"content":" Subtitle / Abstract\nIn graph computing platforms, what defines your upper bound is usually not a single algorithm but the execution model. This article breaks Pregel (BSP) and GAS down to executable reality: how messages flow, how state converges, when it slows down, and how to run parallel BFS.\nEstimated reading time: 16-20 minutes Tags: Pregel, GAS, PageRank, CC, parallel BFS SEO keywords: Pregel, BSP, GAS, PageRank, Connected Components, parallel BFS Meta description: Engineering practice for graph computation models: from Pregel/GAS concepts to runnable implementations of PageRank, CC, and parallel BFS. Target Audience Engineers building graph databases / graph engines / graph analytics platforms Developers who already know BFS/DFS/PageRank but do not know how distributed graph computation is organized Architects who must trade off throughput, latency, and convergence rounds Background / Motivation On the same graph, for the same PageRank task:\nA single-machine script may converge in 10 seconds; A distributed run may still run after 3 minutes; Changing partition strategy may bring it down to 40 seconds. This shows bottlenecks are often not in the formula, but in the execution model.\nThe two most common models in engineering are:\nPregel (BSP): synchronous progress by supersteps; GAS (Gather-Apply-Scatter): aggregate edge contributions, then update state. If you do not understand these two models:\nPageRank stays at formula level without stable convergence behavior; CC (Connected Components) becomes a high-communication implementation; parallel BFS can suffer frontier explosion and stragglers. 
Quick Orientation Map (60-120 Seconds) Problem shape: iterative propagation on large graphs (ranking, labeling, distance) Core sentence: rewrite \u0026ldquo;graph traversal\u0026rdquo; as \u0026ldquo;vertex state machine + round-based progression\u0026rdquo; When to use: |V|\u0026gt;=10^6, |E|\u0026gt;=10^7, and you need batch whole-graph computation When to avoid: single point queries, low-latency online path queries (should use query engine) Complexity overview: per-round approximately O(E/P) (P = parallelism), total roughly rounds × per-round cost Common failure: high-degree hubs cause message skew; a slow partition drags barrier time Deep-Dive Focus (PDKH) This article deeply focuses on two concepts via the PDKH ladder:\nSynchronous supersteps and convergence criteria (Pregel/BSP core) Frontier propagation and idempotent aggregation (parallel BFS / CC core) Covered PDKH steps:\nProblem Reframe Minimal Worked Example Invariant Formalization Correctness Sketch Thresholds Failure Mode Engineering Reality Core Concepts 1) Pregel (BSP) Each vertex keeps state state[v] Each superstep reads previous-round messages inbox[v] Computation sends new messages to neighbors Global barrier before next round Core invariant:\nRound t reads only the complete output of round t-1, never half-round intermediate states.\n2) GAS (Gather-Apply-Scatter) Gather: collect contributions from adjacent edges (parallelizable) Apply: update vertex state Scatter: decide which neighbors receive propagation Compared with Pregel’s explicit messaging, GAS is closer to \u0026ldquo;edge computation + vertex aggregation.\u0026rdquo;\n3) Unified Formula View Many graph algorithms can be written as:\nx_v^{(t+1)} = F(x_v^{(t)}, AGG({ M_{u-\u0026gt;v}(x_u^{(t)}, e_{uv}) }))\nVariables:\nx_v^{(t)}: state of vertex v at round t M_{u-\u0026gt;v}: edge propagation function AGG: aggregation operator (sum/min/max) F: state update function When AGG is commutative and associative, parallelization and partitioning 
become much easier.\nA — Algorithm (Algorithm Problem and Execution Model) Problem Restatement (Engineering Version) Given graph G=(V,E), support in distributed execution:\nPageRank: global importance scores; CC: connected-component labels on undirected graph; BFS(src, hop_limit): level-wise reachability and shortest hop count. Inputs and Outputs Name Type Description V vertex set vertex IDs E edge set adjacency relations P int partition/parallelism max_iter int maximum iteration rounds output1 rank[v] PageRank score output2 label[v] CC label output3 dist[v] BFS distance (INF if unreachable) Minimal Example Graph 0 -\u0026gt; 1,2 1 -\u0026gt; 2 2 -\u0026gt; 3 3 -\u0026gt; 4 4 -\u0026gt; (none) PageRank: mass diffuses along outgoing edges; sink vertices need special handling CC (treated as undirected): all vertices in one component BFS(0): dist=[0,1,1,2,3] C — Concepts (Core Ideas) How Pregel Runs PageRank Per superstep:\nGather (implemented via messages): collect inbound contributions; Apply: rank[v]=(1-d)/N + d*sum(inbox[v]); Scatter: send rank[v]/out_degree[v] to outgoing neighbors. Common convergence criteria:\nL1 delta = Σ|rank_t-rank_{t-1}| \u0026lt; ε or fixed rounds (for example, 20-30) Engineering threshold example:\nAt N=10^8, fixed rounds + sampled validation is common to avoid high overhead of full-delta statistics. How Pregel Runs CC State: label[v] initialized as v.\nPer round, send current minimum label to neighbors and update to the minimum received.\nInvariant:\nlabel[v] is monotonically non-increasing; it can decrease only finitely many times, then stabilizes. This guarantees termination and correctness (on convergence, each connected component reaches one common minimum label).\nWhy Parallel BFS Is Often Layer-Synchronous Parallel BFS is often written as level-synchronous:\nExpand current frontier frontier_t in parallel; Generate frontier_{t+1}; Enter next layer after barrier. 
Pros: stable semantics and naturally correct shortest hop counts.\nCost: frontier explosion greatly increases communication and deduplication costs.\nEquivalent Implementation in GAS View PageRank: Gather=sum(in-neighbor contribution), Apply=rank update, Scatter=notify if delta large CC: Gather=min(neighbor labels), Apply=take min, Scatter=only on changed vertices BFS: Gather=min(parent_dist+1), Apply=relax, Scatter=on newly activated frontier When the ratio of changed vertices is low, GAS incremental propagation can significantly reduce useless edge scans.\nDeep Dive 1: Synchronous Supersteps and Convergence Criteria (Full PDKH) P — Problem Reframe What we really solve is not \u0026ldquo;how to write PageRank formula,\u0026rdquo; but:\nIn distributed systems, how to ensure each round reads a consistent snapshot, can decide convergence globally, and avoids unbounded tail latency from slow partitions.\nThis is BSP’s value: constrain complex parallel behavior into \u0026ldquo;rounds + barriers + global decidability.\u0026rdquo;\nD — Minimal Worked Example Take a 3-node directed cycle: 0-\u0026gt;1-\u0026gt;2-\u0026gt;0, damping d=0.85, initial rank=[1/3,1/3,1/3].\nRound 1:\nEach node sends 0.3333 to one neighbor Updated rank remains 0.3333 delta = 0 This shows that under full symmetry, one round can stabilize.\nBut with chain 0-\u0026gt;1-\u0026gt;2:\nRound 1: mass shifts toward the tail Round 2: sink (out-degree 0) absorbs mass; without sink-mass handling, total mass leaks This is why sink handling must be explicit in production.\nK — Invariant / Contract Two key contracts in standard PageRank-BSP:\nSnapshot contract: round t+1 reads only completed rank from round t. Mass contract: with sink redistribution, sum(rank)=1 (allowing numerical tolerance around 1e-9). 
If asynchronous updates are introduced without compensation, contract 1 breaks.\nIf sink handling is omitted, contract 2 breaks.\nH — Formalization and Thresholds Let N=|V|:\nrank_{t+1}(v) = (1-d)/N + d*(sink_t/N + Σ_{u-\u0026gt;v} rank_t(u)/outdeg(u))\nCommon convergence thresholds:\nAbsolute threshold: L1_delta \u0026lt; ε, e.g. ε=1e-6 Relative threshold: L1_delta / N \u0026lt; ε_avg At N\u0026gt;=10^8, common strategy:\nHard cap at 20-30 rounds; sample 0.1% vertices each round for delta monitoring; stop early if sampled delta stays below threshold for 3 consecutive rounds. The core idea is to compress full-monitoring cost into controllable range.\nCorrectness Sketch Preservation: if round t rank is non-negative and sums to 1, round t+1 is also non-negative and preserves sum constraints through non-negative linear combination. Convergence intuition: damping term (1-d) introduces contraction effect; in common norms the iterative mapping is contractive. Termination: stop when threshold or round cap is reached. Failure Mode ε too small: many extra rounds with no business value. Highly imbalanced partitions: even correct operators get dominated by barrier time. Missing dangling correction: continuous score leakage, distorted ranking. Engineering Reality At 16-64 partitions, bottlenecks are often not floating-point operations, but:\ncross-partition message serialization and network replication; barrier waiting for the slowest partition; hotspot vertices saturating one partition’s CPU. So practical optimization order is usually:\npartitioning and hotspot control first; message compression second; convergence threshold tuning last. 
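The two contracts above can be checked on a single machine before going distributed. Below is a minimal single-process sketch of the BSP loop, assuming the sink-redistribution formula from the H step; `pagerank_bsp` and its parameter names are illustrative, not a platform API.

```python
def pagerank_bsp(out_adj, n, d=0.85, eps=1e-6, max_rounds=30):
    """Single-process sketch of the BSP rounds: each round reads only the
    previous round's completed ranks (snapshot contract) and redistributes
    sink mass so that sum(rank) stays 1 (mass contract)."""
    rank = [1.0 / n] * n
    for _ in range(max_rounds):
        # mass sitting on vertices with no outgoing edges
        sink = sum(rank[v] for v in range(n) if not out_adj.get(v))
        # base term: teleport + redistributed sink mass
        nxt = [(1.0 - d) / n + d * sink / n] * n
        for u, outs in out_adj.items():
            if not outs:
                continue
            share = d * rank[u] / len(outs)
            for v in outs:
                nxt[v] += share
        l1_delta = sum(abs(a - b) for a, b in zip(nxt, rank))
        rank = nxt                  # barrier: swap in the completed round
        if l1_delta < eps:
            break                   # absolute-threshold convergence
    return rank

# chain 0 -> 1 -> 2 from the minimal worked example; vertex 2 is a sink
ranks = pagerank_bsp({0: [1], 1: [2]}, 3)
print(abs(sum(ranks) - 1.0) < 1e-9)  # True: mass contract holds
```

Dropping the `sink` term reproduces the leakage failure described above: on the chain graph, total mass shrinks every round and the ranking distorts.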
Deep Dive 2: Frontier Propagation and Idempotent Aggregation (Full PDKH) P — Problem Reframe The essence of parallel BFS/CC is:\nUse minimal state changes to drive next-round propagation, instead of repeatedly scanning the whole graph.\nThis \u0026ldquo;minimal state change\u0026rdquo; is frontier (or active set).\nD — Minimal Worked Example Graph: 0-\u0026gt;[1,2], 1-\u0026gt;[3], 2-\u0026gt;[3], 3-\u0026gt;[4], source 0.\nLayer progression:\nfrontier_0={0} frontier_1={1,2} frontier_2={3} frontier_3={4} Node 3 is discovered from both 1 and 2.\nWithout idempotent dedup (visited bitmap or min aggregation), next-round propagation is duplicated and message volume inflates.\nK — Invariant / Contract Key invariants for parallel BFS:\nThe first write to dist[v] is the shortest hop count; each vertex should enter frontier only once (ignoring idempotent repeats from fault replay). Key invariants for CC:\nLabels are monotonically non-increasing; label[v] always comes from some vertex in the same component; on convergence, labels are equal within component and may differ across components. H — Formalization and Thresholds BFS formalization (layer-synchronous):\ndist_{t+1}(v) = min(dist_t(v), min_{u in frontier_t, (u,v) in E}(dist_t(u)+1))\nCC formalization (minimum-label propagation):\nlabel_{t+1}(v) = min(label_t(v), min_{u in N(v)} label_t(u))\nCommon engineering thresholds:\nhop_limit \u0026lt;= 3/4/6: common in risk propagation and impact analysis; when |frontier_t| / |V| \u0026gt; 0.2, frontier is near full-graph activation and strategy switch is often needed (for example bitmap batching); when cross-partition edge ratio \u0026gt; 35%, frontier broadcast cost rises sharply. Correctness Sketch For BFS:\nLayer synchronization guarantees \u0026ldquo;shorter paths arrive first\u0026rdquo;; once dist[v] is written, later candidates cannot be shorter (they come from same or deeper layers). 
For CC:\nmin aggregation is idempotent, commutative, and associative, supporting parallel merge; labels only decrease, so finite rounds guarantee stabilization; stabilized state is a constant-label mapping over connected-component equivalence classes. Thresholds and Complexity In sparse graphs (m≈O(n)), early frontiers are often small, so BFS cost can be approximated by local subgraph size.\nIn power-law graphs, if source is near high-centrality vertices, frontier can explode beyond 30% of graph in 1-2 layers.\nSo parallel BFS is not always faster than single-machine BFS:\nIf graph is small or frontier is narrow, distributed scheduling may lose; if graph is large and frontier expands in parallel, distributed gains are significant. Failure Mode Repeated enqueue: without visited/bitmap, messages can blow up exponentially. Incorrect early stop: stopping when one partition sees empty frontier misses active vertices elsewhere. Wrong edge direction use: treating reverse edges as forward in directed graphs changes reachability results. Engineering Reality Real optimization focus for parallel BFS/CC:\nuse bitmap frontier instead of hash set to save 3-10x memory; block-wise send hot adjacency lists to reduce serialization overhead; vertex reindexing improves adjacency access locality and reduces cache misses. These do not change algorithm correctness, but often decide whether runs remain stable.\nFeasibility and Lower-Bound Intuition Why Most Systems Do Not Compute Full Transitive Closure A full reachability matrix takes about O(n^2) space:\nat n=10^6, boolean matrix is roughly 10^12 bits, about 125GB (without index/redundancy) at n=10^7, it directly reaches TB scale and beyond This ignores update-maintenance cost.\nSo industrial systems usually use a two-stage path:\nonline BFS/parallel BFS with hop limit; add reach index or 2-hop labeling on hot subgraphs. 
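The idempotence claim above — duplicate or replayed min-label messages never corrupt CC state — can be demonstrated directly. A minimal sketch; `apply_min_messages` is an illustrative name, not a real framework API:

```python
def apply_min_messages(label, messages):
    """Apply CC min-label messages to a label map. Because min is
    idempotent, commutative, and associative, duplicates from fault
    replay are harmless and merge order does not matter."""
    out = dict(label)
    for v, proposed in messages:
        if proposed < out[v]:
            out[v] = proposed
    return out

label = {0: 0, 1: 1, 2: 2}
msgs = [(1, 0), (2, 1)]
once = apply_min_messages(label, msgs)
replayed = apply_min_messages(label, msgs + msgs)  # simulate fault replay
assert once == replayed == {0: 0, 1: 0, 2: 1}
```

This is exactly why "first successful dist write" BFS and min-label CC tolerate retries, while sum-based PageRank needs round-ID deduplication.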
When BSP/GAS Is Not Cost-Effective Counterexample scenario:\nquery is only single-source single-target path existence; 99% requests end within 1-2 hops; graph fits in single-machine memory (n\u0026lt;5e6, m\u0026lt;5e7 with enough RAM). Here, heavy distributed iteration is usually worse than optimizing a single-machine query engine.\nPractical Guide / Steps Decide semantics first: strict round consistency (BSP) or more aggressive async (accept non-determinism). Choose aggregation operator: prefer sum/min/max; avoid non-commutative aggregates that create sync bottlenecks. Partition well: place highly connected subgraphs together to reduce cross-partition edge ratio. Add early stop: PageRank uses delta\u0026lt;ε; BFS uses empty frontier or hop_limit. Prevent skew: merge/split messages for high-degree vertices; replicate mirrors if needed. Set budgets: cap per-round message count, active-vertex ratio, and max rounds. Worked Example (Track 2-3 Rounds) Example A: CC Two-Round Convergence Segment Graph (undirected): 0-1-2 and 3-4.\nInitial labels: [0,1,2,3,4]\nAfter round 1: [0,0,1,3,3] After round 2: [0,0,0,3,3] Stable after two rounds: component {0,1,2} has label 0; component {3,4} has label 3.\nExample B: BFS Layer-by-Layer From src=0:\nlayer 0: {0} layer 1: {1,2} layer 2: {3} layer 3: {4} First visit equals shortest hop count because layer synchronization ensures \u0026ldquo;shorter before longer.\u0026rdquo;\nPartition-Level Trace (2 Partitions + Barrier) For production realism, here is a 2-partition round trace.\nPartitioning:\nP0: nodes {0,1,2} P1: nodes {3,4,5} Edges:\nintra-partition: 0-\u0026gt;1, 1-\u0026gt;2, 3-\u0026gt;4, 4-\u0026gt;5 cross-partition: 2-\u0026gt;3 Run parallel BFS (src=0):\nSuperstep 0 P0 active: {0}, sends to 1 P1 active: {} after barrier: frontier_1={1} Superstep 1 P0 active: {1}, sends to 2 P1 active: {} after barrier: frontier_2={2} Superstep 2 (Cross-Partition Round) P0 active: {2}, sends to 3 through cross-partition edge P1 activates 3 
after receiving remote message after barrier: frontier_3={3} Superstep 3 P1 active: {3}, sends to 4 P0 idle but still waits at barrier This small example shows two engineering facts:\nCross-partition edges convert local updates into network events; even partitions with no local active vertices must wait at barrier, an inherent BSP cost. Quantifying Communication Cost (Estimate) Let:\nM_t: number of cross-partition messages at round t S_msg: serialized bytes per message B_net: effective network bandwidth (byte/s) Then ideal lower bound of network time for that round is approximately:\nT_net_t \u0026gt;= (M_t * S_msg) / B_net\nIf M_t=5e7, S_msg=16B, B_net=2.5GB/s,\nnetwork transfer lower bound alone is about 0.32s; with deserialization and queuing, actual time is usually much higher.\nThis is why reducing cross-partition messages usually yields more benefit than micro-tuning compute formulas.\nParallel Convergence and Stop Strategies (Production Settings) Recommended PageRank Stop Strategy A common production \u0026ldquo;three-layer stop condition\u0026rdquo;:\niter \u0026gt;= max_iter (hard cap to avoid endless running) global or sampled delta \u0026lt; eps (precision condition) insufficient improvement for consecutive k rounds (benefit condition) Runnable example configuration:\nmax_iter=30 eps=1e-6 early stop if delta improvement \u0026lt; 1% for 3 consecutive rounds This avoids \u0026ldquo;last 10 rounds improve only basis points but consume 40% time.\u0026rdquo;\nRecommended CC Stop Strategy CC commonly uses \u0026ldquo;active set exhausted\u0026rdquo;:\nrecord changed-label vertices per round as A_t terminate when A_t=0 For large graphs, add safety guard:\nif A_t/|V| \u0026lt; 1e-6 for 2 consecutive rounds, run one full validation and stop Recommended BFS Stop Strategy frontier empty: natural termination reach hop_limit: business-driven termination (for example, risk control checks only 4 hops) hit target: single-target query can early-stop Note: in distributed 
systems, early stop must be globally coordinated; a single partition cannot decide alone.\nFault Recovery and Idempotence (Must Consider) In distributed environments, failure is normal rather than exceptional.\nWithout idempotence, retries can corrupt results.\nPageRank Idempotence Concerns replaying same-round messages causes duplicate accumulation; deduplicate by round ID or use recomputable round snapshots. rollback usually goes to latest superstep checkpoint, not patch-style fixes. CC/BFS Idempotence Concerns min aggregation is naturally idempotent: duplicate messages do not worsen minima; if BFS uses \u0026ldquo;first successful dist write\u0026rdquo; as atomic condition, duplicates are safely discarded. This is why many systems prefer sum/min/max:\nnot only parallel-friendly, but also more fault-tolerant.\nCorrectness (Proof Sketch) CC Invariant: label[v] is always some vertex ID inside its component, and is monotonically non-increasing. Preservation: each round only takes smaller labels, never increases. Termination: finite integer monotone descending sequence must terminate. Correctness: minimum label propagates within each connected component; with no cross-component edges, labels do not mix. Layer-Synchronous BFS Invariant: frontier at round k contains exactly nodes with distance k from source. Preservation: expansion only from frontier k to unvisited nodes, labeled k+1. Termination: frontier empty or hop cap reached. Correctness: first-visit level equals shortest hop count. Complexity Let n=|V|, m=|E|, T=iteration rounds, P=parallelism.\nPageRank: about O(T * m / P), space O(n + m/P) (including partition-edge cache) CC: worst case O(D * m / P), where D is upper bound of label-propagation rounds Parallel BFS: per layer approximately O(m_active/P), total roughly one pass over edges What matters most is not Big-O itself, but:\ncross-partition edge ratio; per-round barrier waiting; active-vertex ratio curve. 
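The per-round network lower bound T_net_t \u0026gt;= (M_t * S_msg) / B_net quoted earlier in \u0026ldquo;Quantifying Communication Cost\u0026rdquo; is easy to turn into a planning helper; the numbers below are the ones from the text:

```python
def network_time_lower_bound(m_messages, msg_bytes, bandwidth_bytes_per_s):
    """Ideal lower bound on per-round network transfer time (seconds).
    Ignores deserialization and queuing, so real time is usually higher."""
    return (m_messages * msg_bytes) / bandwidth_bytes_per_s

# From the text: 5e7 messages, 16 bytes each, 2.5 GB/s effective bandwidth.
t = network_time_lower_bound(5e7, 16, 2.5e9)
assert abs(t - 0.32) < 1e-9  # about 0.32s per round, before any CPU cost
```

Halving M_t halves this bound, which is why cutting cross-partition messages usually beats micro-tuning compute formulas.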
Constant Factors and Engineering Realities Barrier cost: BSP waits for slowest partition each round; tail tasks determine latency. Message amplification: high-degree vertices can amplify one update to millions of messages. Cache locality: CSR sequential scans are usually better than random adjacency access. Dedup cost: BFS next_frontier without bitmap/bucketing causes huge shuffle pressure. Convergence monitoring: exact global delta is costly at very large scale; sampled monitoring + round caps is practical. Runnable Example (Python)\ndef pagerank_bsp(adj, d=0.85, max_iter=30, eps=1e-8):\n    n = len(adj)\n    rank = [1.0 / n] * n\n    out_deg = [len(nei) for nei in adj]\n    for _ in range(max_iter):\n        inbox = [(1.0 - d) / n for _ in range(n)]\n        sink_mass = 0.0\n        for u in range(n):\n            if out_deg[u] == 0:\n                sink_mass += rank[u]\n                continue\n            share = d * rank[u] / out_deg[u]\n            for v in adj[u]:\n                inbox[v] += share\n        if sink_mass \u0026gt; 0:\n            extra = d * sink_mass / n\n            for v in range(n):\n                inbox[v] += extra\n        delta = sum(abs(inbox[i] - rank[i]) for i in range(n))\n        rank = inbox\n        if delta \u0026lt; eps:\n            break\n    return rank\n\ndef cc_label_propagation_undirected(adj, max_iter=100):\n    n = len(adj)\n    label = list(range(n))\n    for _ in range(max_iter):\n        changed = False\n        new_label = label[:]\n        for v in range(n):\n            best = label[v]\n            for u in adj[v]:\n                if label[u] \u0026lt; best:\n                    best = label[u]\n            if best \u0026lt; new_label[v]:\n                new_label[v] = best\n                changed = True\n        label = new_label\n        if not changed:\n            break\n    return label\n\ndef bfs_level_sync(adj, src, hop_limit=None):\n    n = len(adj)\n    dist = [-1] * n\n    dist[src] = 0\n    frontier = [src]\n    level = 0\n    while frontier:\n        if hop_limit is not None and level \u0026gt;= hop_limit:\n            break\n        next_frontier = []\n        for u in frontier:\n            for v in adj[u]:\n                if dist[v] == -1:\n                    dist[v] = level + 1\n                    next_frontier.append(v)\n        frontier = next_frontier\n        level += 1\n    return dist\n\nif __name__ == \u0026#34;__main__\u0026#34;:\n    directed = [[1, 2], [2], [3], [4], []]\n    undirected = [[1], [0, 2], [1], [4], [3]]\n    pr = pagerank_bsp(directed, max_iter=50)\n    cc = cc_label_propagation_undirected(undirected)\n    dist = bfs_level_sync(directed, src=0, hop_limit=4)\n    print(\u0026#34;PageRank:\u0026#34;, [round(x, 6) for x in pr])\n    print(\u0026#34;CC labels:\u0026#34;, cc)\n    print(\u0026#34;BFS dist:\u0026#34;, dist)\nRun:\npython3 graph_compute_demo.py\nE — Engineering (Production Scenarios) Scenario 1: Offline PageRank for Recommendation Graphs Background: candidate-pool weights are refreshed daily on graphs around 10^8 edges. Why BSP: synchronous rounds + fixed convergence criteria, stable and replayable outputs. Key optimizations: sink-mass aggregation, in-partition combiners, sampled delta monitoring. Scenario 2: CC Clustering for Risk Graphs Background: identify gangs/device clusters with explainable labels. Why label-propagation CC: min aggregation is idempotent and easy to recover under failure. Key optimization: propagate only vertices with label changes to reduce useless messaging. Scenario 3: Parallel BFS for k-hop Propagation Background: account risk diffusion and call-chain impact analysis. Why layer sync: shortest-hop semantics are naturally correct and easy to constrain by hop_limit. Key optimization: frontier bitmap + vertex reindexing to reduce shuffle and random access. 
Alternatives and Tradeoffs\n| Strategy | Pros | Cons | Best-fit range |\n| --- | --- | --- | --- |\n| Pregel/BSP | Clear semantics, stable output | High barrier overhead | Offline batch, replay-critical tasks |\n| GAS (synchronous) | Edge-friendly, unified expression | Framework complexity | Mixed algorithm platforms |\n| Async graph compute | Potential faster convergence | Non-deterministic, harder debugging | Iterative tasks with low consistency demand |\n| Single-machine traversal | Simple development | Lower memory/throughput ceiling | Prototype phase around m \u0026lt;= 10^7 |\nWhy prioritize Pregel/GAS here:\nYou care about production execution of PageRank/CC/BFS rather than one-off point queries; all three map well to \u0026ldquo;aggregatable iterative propagation\u0026rdquo;; synchronous models are easier for SLA and regression alignment. Validation and Benchmark Checklist (Must Run Before Rollout) Algorithm-only without validation is risky in production.\nSplit validation into correctness, stability, and cost.\n1) Correctness Validation PageRank: verify sum(rank) is near 1 (for example, error \u0026lt;1e-6). CC: sample edges (u,v) and verify equal labels for same-component vertices. BFS: sample nodes and compare dist against single-machine baseline. Use two datasets:\nsmall graph (n\u0026lt;=1e4) for manual traceability; medium graph (n≈1e6) for parallel-vs-single-machine consistency. 2) Stability Validation Run same input 5 times and observe output drift (especially async mode). Inject partition failures and verify checkpoint recovery continues convergence. Stress with partition counts P=8/16/32/64 and check long-tail behavior. 
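The checks in \u0026ldquo;1) Correctness Validation\u0026rdquo; above can be sketched as small helpers. A minimal sketch; the helper names are illustrative, and in production each check runs on sampled vertices/edges rather than the full graph:

```python
from collections import deque

def validate_pagerank(rank, tol=1e-6):
    # Contract: rank mass is preserved, so the sum must stay near 1.
    return abs(sum(rank) - 1.0) < tol

def validate_cc(edges, label):
    # Every (sampled) undirected edge must connect equal-label vertices.
    return all(label[u] == label[v] for u, v in edges)

def bfs_baseline(adj, src):
    """Single-machine BFS baseline for comparing parallel dist output."""
    dist = [-1] * len(adj)
    dist[src] = 0
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if dist[v] == -1:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Smoke check on a 3-node path graph 0-1-2.
adj = [[1], [0, 2], [1]]
assert validate_pagerank([0.3, 0.4, 0.3])
assert validate_cc([(0, 1), (1, 2)], [0, 0, 0])
assert bfs_baseline(adj, 0) == [0, 1, 2]
```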
Recommended key metrics:\nper-round duration t_iter_p50/p95 barrier wait ratio active vertex ratio curve A_t/|V| 3) Cost Validation cross-partition message volume (per round and total) peak memory (frontier, inbox, adjacency cache) per-round network sent bytes Empirically, if you see:\nbarrier time \u0026gt; 35% of round total time cross-partition messages \u0026gt; 50% of total messages then optimize partition strategy first, not algorithm micro-parameters.\n4) Regression Baseline Recommendation Keep a replayable baseline for each task:\nfixed input snapshot ID fixed parameters (d, eps, max_iter, hop_limit) fixed partition strategy version This lets each optimization clearly answer:\ntrue algorithm/accuracy improvement; or fake improvement from system noise. Migration Path After this article, continue in order:\nJoin-based Graph Query (Expand/Filter/Join executor) Subgraph matching (VF2 + pruning) Dynamic graph incremental computation (local recomputation after edge updates) Graph indexing (2-hop labeling / reach index) 30-Second Selection Decision Tree (Directly Reusable) For graph platform selection, start with these four questions:\nMust results be strictly reproducible?\nYes: prefer synchronous BSP/Pregel; no: evaluate async engines.\nIs this a whole-graph iterative task?\nYes: PageRank/CC use GAS or Pregel;\nNo: for point queries, use query engine rather than distributed iteration.\nIs active-vertex ratio consistently below 5%?\nYes: prefer incremental propagation (scatter only changed vertices);\nNo: full-edge scans may be more stable.\nAre cross-partition edges above 40%?\nYes: repartition first, then tune algorithms;\nNo: then tune thresholds, compression, and operators.\nCore value of this tree is fixing optimization order:\narchitecture and partitioning first, execution model second, algorithm parameters last.\nFAQ and Caveats Must PageRank run to very small eps?\nNot always. 
Online workloads often use \u0026ldquo;fixed rounds + sampled checks\u0026rdquo; to balance cost and stability.\nCan CC run asynchronously?\nYes, but reproducibility degrades and debugging gets harder; clarify business tolerance first.\nWhere does parallel BFS explode most often?\nHigh-degree nodes can trigger frontier explosion, making dedup and communication dominant bottlenecks.\nWhy not compute full transitive closure directly?\nStorage is near O(n^2), almost unacceptable at million-scale vertices.\nWhich parameter should be tuned first?\nRecommended order: partition -\u0026gt; round cap -\u0026gt; early-stop threshold -\u0026gt; message compression.\nDo not tune only eps first; common outcome is slower runs with little gain.\nHow to set BFS hop_limit?\nSet hard boundary from business semantics first, then evaluate recall gain from historical data.\nFor example, risk propagation commonly starts at k=3, then compare marginal value of k=4/5 vs extra cost.\nWhen should synchronous be replaced by asynchronous?\nOnly after confirming business can accept non-determinism and barrier waiting is truly dominant (for example \u0026gt;40%).\nBest Practices and Recommendations Structure algorithms as \u0026ldquo;state + aggregation + propagation\u0026rdquo; for implementation unification. Every iterative task should define hard stop conditions (round/budget/time window). Prefer idempotent aggregations (sum/min/max) for better fault tolerance and retry stability. Apply dedicated handling for high-degree vertices (mirrors, replicas, message merge). Monitor at least: active-vertex ratio, cross-partition message volume, per-round p95 latency. Preserve replay outputs with same input/params after each optimization to avoid confusing noise with progress. 
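The hop_limit question above — start at k=3, then weigh the marginal value of k=4/5 against extra cost — can be quantified by counting newly reached vertices per extra hop on historical data. A minimal sketch under a hypothetical adjacency map:

```python
def reach_by_hop(adj, src, max_hops):
    """Newly reached vertices at each hop from src. A flat or shrinking
    tail suggests raising hop_limit buys little extra recall."""
    dist = {src: 0}
    frontier = [src]
    gains = []
    for hop in range(1, max_hops + 1):
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in dist:  # first visit equals shortest hop count
                    dist[v] = hop
                    nxt.append(v)
        gains.append(len(nxt))
        frontier = nxt
    return gains

# Chain 0->1->2->3->4: each extra hop reaches exactly one new vertex.
adj = {0: [1], 1: [2], 2: [3], 3: [4], 4: []}
assert reach_by_hop(adj, 0, 4) == [1, 1, 1, 1]
```

Running this over sampled sources gives the recall-per-hop curve used to justify (or reject) a larger hop_limit.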
R — Reflection The most common error in these tasks is treating \u0026ldquo;formula correctness\u0026rdquo; as \u0026ldquo;system readiness.\u0026rdquo;\nWhat truly determines production quality:\nwhether model semantics are reproducible; whether rounds and communication are budgetable; whether skew and failure recovery have concrete plans. Pregel and GAS provide an engineering abstraction boundary, not one standalone algorithm.\nS — Summary Pregel (BSP) fits offline graph computation requiring determinism and replayability. GAS fits algorithm families expressible as \u0026ldquo;edge contribution -\u0026gt; vertex update -\u0026gt; selective propagation.\u0026rdquo; PageRank, CC, and parallel BFS all reduce to \u0026ldquo;aggregation + iterative state update.\u0026rdquo; Parallel performance ceiling is usually set by communication skew and barriers, not formula complexity. To run graph algorithms stably, design stop conditions, budgets, and monitoring before optimization tricks. In real systems, gains usually come from reducing cross-partition messages and controlling active frontiers, not from 5% operator micro-tuning. Every optimization should be paired with regression validation and versioned baselines. References and Further Reading Pregel: A System for Large-Scale Graph Processing (Google, 2010) PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs (OSDI 2012) GraphX: Unifying Data-Parallel and Graph-Parallel Analytics Neo4j Graph Data Science docs (PageRank / WCC) Apache Spark GraphX / GraphFrames official docs Call to Action (CTA) Start with one existing graph job and do one \u0026ldquo;model rewrite\u0026rdquo;:\nexpress the job as state + aggregation + propagation; define clear round stop conditions; record active-vertex ratio and cross-partition message volume per round. 
After these three steps, you will clearly see whether your bottleneck is algorithm, partitioning, or execution model.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/110-graph-computation-models-pregel-gas-parallel-bfs/","summary":"A systematic walkthrough of Pregel (BSP) and GAS (Gather-Apply-Scatter), focused on execution paths, convergence strategies, and engineering trade-offs for PageRank, Connected Components, and parallel BFS.","title":"Practical Graph Computation Models: How Pregel (BSP) and GAS Run PageRank/CC/Parallel BFS"},{"content":" Subtitle / Abstract\nGraph partitioning is not a minor offline preprocessing trick. It is a major production performance lever in graph databases: partition incorrectly, and both query latency and network traffic go out of control. Using the ACERS template, this article explains the trade-offs of Edge-cut vs Vertex-cut, the multilevel intuition behind METIS, and the metrics that actually matter in engineering.\nEstimated reading time: 18-22 minutes Tags: graph partitioning, Edge-cut, Vertex-cut, METIS SEO keywords: Graph Partitioning, Edge-cut, Vertex-cut, METIS, Query Latency Meta description: From objective functions to engineering metrics, understand how graph partitioning affects query latency and network communication, with runnable code and tuning steps. Target Audience Backend engineers building graph databases, graph computing platforms, risk-control graphs, or recommendation graphs Performance engineers who need to diagnose \u0026ldquo;slow queries\u0026rdquo; at the partitioning layer Algorithm engineers who want to move from concept-level understanding to production implementation Background / Motivation In relational databases, performance is often improved with indexes, join reordering, and cache hit optimization. 
In graph databases, cross-machine edges are often the first bottleneck.\nWhen a query path frequently crosses partitions, it triggers:\nRemote RPC round trips (RTT) Remote subgraph fetching and deserialization Multi-partition coordination and result merge overhead So in production, graph partitioning directly impacts two core metrics:\nQuery latency (p95/p99) Network communication volume (bytes/s, cross-partition messages) In one sentence: in production graph databases, partitioning algorithms are not optional optimization, they are a foundational capability.\nCore Concepts Graph Partitioning: split a graph into k partitions while minimizing inter-partition coupling and maintaining load balance. Edge-cut: minimize the number of cross-partition edges, with each vertex assigned to a single partition. Vertex-cut: partition by edges and allow vertices to be replicated across partitions; useful for reducing skew caused by hot edges. Balance Constraint: partition load must not skew too much; common constraints include |V_i| \u0026lt;= (1+ε)|V|/k or edge-load constraints. METIS (core idea): a multilevel flow (Coarsen -\u0026gt; Initial Partition -\u0026gt; Uncoarsen + Refine) that reduces global search cost by \u0026ldquo;coarsen first, refine later.\u0026rdquo; Quick Orientation Map (60-120 Seconds) Problem shape: split a large graph into k partitions while minimizing cross-machine access and keeping load balanced. One-line core: choose an objective first (Edge-cut/Vertex-cut), then solve an initial partition with a multilevel method and do incremental corrections. When to use / avoid: static or slowly changing graphs fit offline baseline partitioning; high-frequency dynamic graphs need incremental rebalancing support. Complexity glance: optimal partitioning is combinatorially hard; engineering relies on approximate algorithms plus a monitoring feedback loop. Common failure mode: optimizing only cut while ignoring balance can make p99 worse. 
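The Edge-cut metric and the balance constraint |V_i| \u0026lt;= (1+ε)|V|/k from the concepts above fit in a few lines. A minimal sketch; the sample graph (two triangles joined by one bridge edge) is hypothetical:

```python
def edge_cut(edges, part):
    """Number of edges whose endpoints land in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])

def is_balanced(part, k, eps=0.05):
    """Check the vertex-balance constraint |V_i| <= (1 + eps) * |V| / k."""
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    limit = (1 + eps) * len(part) / k
    return all(s <= limit for s in sizes)

# Two triangles joined by bridge edge (2,3): community split gives cut=1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
assert edge_cut(edges, part) == 1
assert is_balanced(part, k=2)
```

A good partitioner minimizes the first metric subject to the second; optimizing cut alone is exactly the failure mode flagged above.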
Master Mental Model Core abstraction: graph partitioning is a constrained graph-cut optimization problem. Problem family: combinatorial optimization + local search + multi-objective trade-offs (communication, latency, load). Isomorphism with known templates: Offline stage resembles multilevel coarse-to-fine optimization. Online stage resembles local hill-climbing with budget-limited migrations. Feasibility and Lower-Bound Intuition For densely connected graphs without clear community boundaries, the theoretical lower bound of cut will not be very low. When query templates inherently cross communities (for example, cross-domain risk-control paths), even perfect partitioning cannot reduce cross-machine access to zero. Under strong power-law degree distributions (a few ultra-high-degree nodes), pure Edge-cut faces a hotspot lower bound: You may reduce cut, but it is hard to flatten hotspots at the same time. Counterexample:\nIf one supernode connects to 100k edges and traffic concentrates around it, forcing strict non-replication leads to severe skew in one partition. 
In this case, Vertex-cut is often more realistic than Edge-cut.\nProblem Modeling and Constraint Scale In practice, explicitly decompose the objective:\n\\[ \\text{Score} = \\alpha \\cdot \\text{CutCost} + \\beta \\cdot \\text{ImbalanceCost} + \\gamma \\cdot \\text{HotspotCost} \\]\nWhere:\nCutCost: number of cross-partition edges or weighted cross-edge sum ImbalanceCost: penalty for deviation from target partition capacity HotspotCost: local congestion penalty caused by hot vertices or edges α,β,γ: business weights (derived from SLA) Scale recommendations (starting points, not hard standards):\nTens of millions of vertices and hundreds of millions of edges: prioritize offline multilevel partitioning + periodic recalibration As partition count k increases: check network bottlenecks first, then single-machine bottlenecks; avoid blindly increasing partitions Scan ε (load slack) typically from 0.03 to 0.10 A — Algorithm (Problem and Algorithm) Problem Restatement (Engineering Form) Given a large graph G=(V,E), split it into k partitions such that:\nPartition load is as balanced as possible; Frequently traversed query edges stay inside partitions; Network communication is minimized; Hot vertices are handled with controllable strategy (avoid blowing up one machine). 
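The Score decomposition above can be written directly as a comparable objective used to accept or reject a candidate plan. A minimal sketch; the concrete cost definitions, weights, and numbers are illustrative — in practice α, β, γ come from your SLA:

```python
def partition_score(cut_edges, sizes, hot_load, target_size,
                    alpha=1.0, beta=2.0, gamma=4.0):
    """Score = alpha*CutCost + beta*ImbalanceCost + gamma*HotspotCost.
    Illustrative cost definitions: cut count, capacity overshoot,
    and worst per-partition hotspot load."""
    cut_cost = cut_edges
    imbalance_cost = sum(max(0, s - target_size) for s in sizes)
    hotspot_cost = max(hot_load) if hot_load else 0
    return alpha * cut_cost + beta * imbalance_cost + gamma * hotspot_cost

# Lower score wins: a candidate migration is accepted only if it reduces Score.
base = partition_score(cut_edges=30, sizes=[100, 140], hot_load=[5, 9], target_size=120)
cand = partition_score(cut_edges=24, sizes=[115, 125], hot_load=[5, 7], target_size=120)
assert cand < base
```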
Inputs and Outputs\n| Name | Type | Description |\n| --- | --- | --- |\n| G | graph | Production graph (optionally weighted) |\n| k | int | Number of partitions |\n| obj | enum | Objective: Edge-cut or Vertex-cut |\n| constraint | config | Load-balance threshold, hotspot threshold |\n| return | part(v) / part(e) | Mapping of vertex or edge to partition |\nExample (8 Nodes, 2 Partitions) Community A: 0-1-2-3-0 Community B: 4-5-6-7-4 Bridge edges: (1,4), (2,5), (3,6) If cut by communities: P0={0,1,2,3}, P1={4,5,6,7}, Edge-cut = 3 If cut randomly: Edge-cut is often \u0026gt;= 6 This is exactly where \u0026ldquo;query latency can differ by more than 2x\u0026rdquo;: more cross-partition edges make distributed query loops far more likely.\nDeriving the Approach (From Brute Force to Practical) Naive Brute Force Enumerate all partition assignments, then compute cut and balance Exponential complexity, not deployable Key Observations Production graphs are usually sparse but huge, so we need approximate optimality rather than global optimality Most gains come from: reducing cross-partition edges avoiding hot partitions Algorithm names are not first priority; objective + constraints + metric feedback loop are. Method Selection Edge-cut track: common for OLTP graph queries, short paths, k-hop lookup Vertex-cut track: more stable when ultra-high-degree vertices are obvious (celebrity vertices, super accounts) METIS idea: one of the industrial default choices for offline baseline partitioning C — Concepts (Core Ideas) 1) Edge-cut vs Vertex-cut Edge-cut (Unique Vertex Ownership) Objective (simplified):\n\\[ \\min \\sum_{(u,v)\\in E} [part(u) \\neq part(v)] \\]\nPros: intuitive model, simple query routing Cons: supernodes can pull large numbers of edges into cross-partition traffic Vertex-cut (Edge Ownership, Vertex Replication) A common metric is replication factor:\n\\[ RF = \\frac{1}{|V|}\\sum_{v\\in V} |A(v)| \\]\nWhere A(v) is the partition set containing vertex v. 
Lower RF is better.\nPros: can spread high-degree vertex edges across machines Cons: more complex replica consistency and read/write path 2) METIS Multilevel Intuition (Must Understand) The core of METIS is not one magical formula, but a three-stage flow:\nCoarsening: heavy-edge matching to shrink the graph Initial Partition: quickly generate an initial split on the small graph Uncoarsen + Refine: project back layer by layer and reduce cut with FM/KL-like local optimization Engineering value: turn one huge hard problem into many small corrections, usually more stable than greedy partitioning on the original graph.\nDeepening Focus (PDKH) This article deeply expands two concepts:\nConcept A: Mapping Edge-cut objective to query latency Concept B: METIS multilevel partition workflow Concept A: Edge-cut -\u0026gt; Latency Problem Reframe: partition quality essentially compresses cross-machine hops. Minimal Example: under the same query template, Edge-cut=3 vs Edge-cut=7 can roughly double cross-machine requests. Invariant: under valid load constraints, reducing cross-partition edges does not increase expected remote hops. Formalization: latency ≈ local_cpu + remote_rtt * cross_hops + deserialize_cost cross_hops is highly correlated with cut ratio. Correctness Sketch: with a fixed query template, fewer cross-partition edges means fewer boundary events triggering remote access. Threshold: when cut_ratio \u0026gt; 0.25, many online graph queries show clear p99 degradation (empirical threshold; calibrate per business). Failure Mode: minimizing cut without load control creates hot partitions and may reduce overall throughput. Engineering Reality: cut must be evaluated together with partition load and hotspot degree distribution; do not drive by one metric. 
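The latency approximation in Concept A (latency ≈ local_cpu + remote_rtt * cross_hops + deserialize_cost) can be sketched as a planning model; all parameter values below are hypothetical, chosen only to show the direction of the effect:

```python
def estimate_latency_ms(local_cpu_ms, rtt_ms, cross_hops, deserialize_ms_per_hop):
    """Rough planning model, not a measurement: each cross-partition hop
    adds one RTT plus one remote-deserialization cost."""
    return local_cpu_ms + rtt_ms * cross_hops + deserialize_ms_per_hop * cross_hops

# Same query template under two partitionings: fewer cross-partition hops
# translates directly into lower modeled latency.
worse = estimate_latency_ms(local_cpu_ms=20, rtt_ms=1.5, cross_hops=7,
                            deserialize_ms_per_hop=0.8)
better = estimate_latency_ms(local_cpu_ms=20, rtt_ms=1.5, cross_hops=3,
                             deserialize_ms_per_hop=0.8)
assert better < worse
```

Because cross_hops tracks cut_ratio for a fixed query template, this is the mechanism behind the empirical p99 degradation threshold noted above.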
Concept B: METIS Multilevel Flow Problem Reframe: not \u0026ldquo;solve once,\u0026rdquo; but \u0026ldquo;coarsen, solve coarse, then refine layer by layer.\u0026rdquo; Minimal Example: shrink a 10M-edge graph to 200k edges, partition, then replay refinements. Invariant: each refinement accepts only migrations that reduce objective or keep constraints balanced. Formalization: Coarsen -\u0026gt; Partition -\u0026gt; Uncoarsen/Refine. Correctness Sketch: not globally optimal, but monotonic local improvements ensure non-degrading objective. Threshold: larger graphs and clearer community structure usually yield more stable multilevel gains. Failure Mode: if graph changes too fast, offline partitioning ages quickly and gains decay. Engineering Reality: must pair with incremental rebalance (periodic repartition + hotspot migration). Practical Guide / Steps Define objective first: choose Edge-cut or Vertex-cut before choosing algorithm names. Define constraints: partition capacity, hotspot threshold, migration budget. Get offline baseline partition: use METIS-style multilevel ideas. Observe online metrics: cut_ratio, RF, p95/p99, cross-partition bytes. Do local rebalance: migrate in small steps based on hotspot and cross-edge contribution; avoid full repartition. Run regression validation: benchmark representative query templates, not only one-time batch stats. Selection Guide Choose objective by degree distribution: Smooth degree distribution: try Edge-cut first. Strong power-law: prioritize Vertex-cut evaluation. Choose objective by query type: Shortest path / local subgraph reads: Edge-cut is easier for routing optimization. Batch traversal / message propagation: Vertex-cut is often more stable under hotspot pressure. Choose strategy by machine memory: Tight memory: reduce replication, use Vertex-cut cautiously. Relatively ample memory: replication can be used to trade for throughput stability. 
Choose cadence by migration budget: Low migration budget: local incremental correction. Acceptable maintenance window: offline repartition + incremental backfill. Runnable Example (Python) Below is a runnable local-search example with \u0026ldquo;balance constraints + cut cost\u0026rdquo; (for understanding objectives, not a full METIS implementation):\nfrom collections import defaultdict from typing import Dict, List, Tuple Edge = Tuple[int, int] def edge_cut(edges: List[Edge], part: Dict[int, int]) -\u0026gt; int: return sum(1 for u, v in edges if part[u] != part[v]) def partition_sizes(part: Dict[int, int], k: int) -\u0026gt; List[int]: sizes = [0] * k for node in part: sizes[part[node]] += 1 return sizes def greedy_balanced_partition( nodes: List[int], edges: List[Edge], k: int, max_imbalance: float = 0.10, max_iter: int = 20, ) -\u0026gt; Dict[int, int]: part = {node: node % k for node in nodes} limit = int((1.0 + max_imbalance) * len(nodes) / k) + 1 adj = defaultdict(list) for u, v in edges: adj[u].append(v) adj[v].append(u) for _ in range(max_iter): improved = False sizes = partition_sizes(part, k) for node in nodes: current = part[node] best_part = current best_gain = 0 for candidate in range(k): if candidate == current: continue if sizes[candidate] + 1 \u0026gt; limit: continue # Estimate cut change if node is moved (positive means lower cut). 
gain = 0 for nei in adj[node]: before_cross = 1 if part[nei] != current else 0 after_cross = 1 if part[nei] != candidate else 0 gain += (before_cross - after_cross) if gain \u0026gt; best_gain: best_gain = gain best_part = candidate if best_part != current: sizes[current] -= 1 sizes[best_part] += 1 part[node] = best_part improved = True if not improved: break return part def main() -\u0026gt; None: nodes = list(range(8)) edges = [ (0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (7, 4), (1, 4), (2, 5), (3, 6), ] k = 2 init_part = {node: node % k for node in nodes} init_cut = edge_cut(edges, init_part) opt_part = greedy_balanced_partition(nodes, edges, k=k) opt_cut = edge_cut(edges, opt_part) print(\u0026#34;init part:\u0026#34;, init_part, \u0026#34;cut=\u0026#34;, init_cut) print(\u0026#34;opt part :\u0026#34;, opt_part, \u0026#34;cut=\u0026#34;, opt_cut) if __name__ == \u0026#34;__main__\u0026#34;: main() Run:\npython3 graph_partition_demo.py Runnable Example 2: Vertex-cut Replication Factor Estimation from collections import defaultdict from typing import Dict, List, Tuple Edge = Tuple[int, int] def replication_factor(edges: List[Edge], edge_part: Dict[Edge, int], n_nodes: int) -\u0026gt; float: node_parts = defaultdict(set) for (u, v), p in edge_part.items(): node_parts[u].add(p) node_parts[v].add(p) total = sum(len(node_parts[node]) if node in node_parts else 1 for node in range(n_nodes)) return total / n_nodes def main() -\u0026gt; None: # Simplified example: 3 partitions. 
edges = [(0, 1), (0, 2), (0, 3), (4, 5), (5, 6), (6, 7), (3, 4)] edge_part = { (0, 1): 0, (0, 2): 1, (0, 3): 2, (4, 5): 1, (5, 6): 1, (6, 7): 1, (3, 4): 2, } rf = replication_factor(edges, edge_part, n_nodes=8) print(\u0026#34;replication factor =\u0026#34;, round(rf, 3)) if __name__ == \u0026#34;__main__\u0026#34;: main() This example is meant to visualize RF trend shifts: on the same graph, node replication overhead can differ significantly under different partitioning strategies.\nExplanation and Principles (Why This Works) The key value of this example is to make partition quality measurable:\nYou can directly observe how much cut is reduced; You can add business query weights to give critical edges higher importance; You can tighten balance constraints and observe the latency/throughput turning point. In real production, METIS applies a more systematic multilevel scheme \u0026ldquo;coarsen + refine during uncoarsening\u0026rdquo; at larger scales, but the underlying idea is still:\nHave an objective; Have constraints; Have observable metric feedback. Worked Example (Trace) Below is a simplified trace of whether a partition migration is worthwhile:\nInitial: cut_ratio = 0.29, p99 = 410ms, cross_bytes = 1.8GB/min Candidate migration: move a 20k-node subgraph from P3 to P5 Estimated benefit: cut_ratio -\u0026gt; 0.23, P5 CPU +5%, P3 CPU -8% Observed after execution:\nHour 1: p99 drops to 330ms, cross_bytes drops to 1.3GB/min Hour 6: P5 load stabilizes, no hotspot alarm Hour 24: peak p99 remains stable at 300~320ms Conclusion: if load stays within threshold after migration, reducing cross edges usually yields stable latency gains.\nCorrectness (Proof Sketch) This does not prove global optimality; it proves monotonic improvement of the local migration strategy:\nInvariant: every migration must satisfy capacity and hotspot constraints. Preservation: accept migration only if Score decreases (or equal score but more stable). 
Termination: local search stops when no candidate migration can further reduce Score. So you get at least a constraint-satisfying local optimum, rather than uncontrolled random fluctuation.\nComplexity and Thresholds Offline multilevel methods often scale roughly linearly, sometimes sublinearly, depending on implementation and graph structure. Online local migration cost per round depends on candidate set size |C| and incremental evaluation cost. In engineering, thresholds matter more than Big-O: Migration window per round (for example, 5-15 minutes) Migration budget per round (for example, at most 0.5% of vertices) Rollback threshold (for example, rollback if p99 rises for 5 consecutive minutes) Constant Factors and Engineering Reality Serialization cost: cross-machine edges force object decode, often with high constant factors. Cache locality: concentrating subgraphs locally improves cache hit rate, which strongly affects the realized gains. Batch window: if offline repartitioning exceeds maintenance windows, theoretical gains can be nullified. Replica consistency: Vertex-cut write paths are more complex; be careful in mixed read-write workloads. Production Troubleshooting Checklist (Required) After partition rollout, do not stop at \u0026ldquo;average latency decreased.\u0026rdquo; Run a 24-hour replay checklist:\nDo the four core metrics improve in the same direction?\nAre p95/p99 lower? Are cross-partition bytes lower? Are cut_ratio or RF moving toward target? Did per-partition CPU/memory stay under alert thresholds? Is the distribution being hidden by averages?\nDid the top 10 slow query templates actually improve? Did long-tail queries regress? Is the trend consistent during peak traffic windows? Are migration side effects controlled?\nAny write jitter during migration windows? Did cache hit rate dip below guardrails? Has the rollback script been drill-tested (at least once)? Did hotspot partitions drift?\nIs today’s hottest partition the same as yesterday’s? 
Did hotspot just move from one machine to another? Is a dedicated hot-vertex strategy needed? Are capacity boundaries exposed early?\nUnder 7-day edge growth forecast, can current k still hold? Will replication-factor growth break memory budget? Should partition expansion windows be reserved early? To make troubleshooting reusable, log partition changes in structured form:\n{ \u0026#34;change_id\u0026#34;: \u0026#34;part-2026-02-09-01\u0026#34;, \u0026#34;strategy\u0026#34;: \u0026#34;edge_cut_with_balance\u0026#34;, \u0026#34;before\u0026#34;: {\u0026#34;cut_ratio\u0026#34;: 0.27, \u0026#34;p99_ms\u0026#34;: 380, \u0026#34;cross_bytes_mb_min\u0026#34;: 1540}, \u0026#34;after\u0026#34;: {\u0026#34;cut_ratio\u0026#34;: 0.21, \u0026#34;p99_ms\u0026#34;: 305, \u0026#34;cross_bytes_mb_min\u0026#34;: 1090}, \u0026#34;risk\u0026#34;: {\u0026#34;hot_partition_cpu_max\u0026#34;: 0.72, \u0026#34;rollback_ready\u0026#34;: true} } This structured record is critical for postmortems: it answers \u0026ldquo;why it worked,\u0026rdquo; \u0026ldquo;whether it is repeatable,\u0026rdquo; and \u0026ldquo;how to do it more safely next time.\u0026rdquo;\nMetric Definitions (Avoid Team Misalignment) Many partition discussions fail not because algorithms are weak, but because metric definitions differ. 
Standardize these:\ncut ratio\nDefinition: #cross-partition edges / #total edges Definition scope: report both on \u0026ldquo;active-subgraph edges\u0026rdquo; and \u0026ldquo;full-graph edges\u0026rdquo; separately cross-partition bytes\nDefinition: total network bytes from cross-partition requests Scope: split read and write paths; read-heavy workloads should prioritize read-path metrics partition hotspot index\nDefinition: max_partition_qps / avg_partition_qps Scope: compute on both 1-minute and 5-minute windows to capture jitter and trend replication factor (Vertex-cut only)\nDefinition: average number of replicas per vertex Scope: separately compute on online-active vertices to avoid risk dilution by cold data With fixed definitions for these four metrics, partition optimization becomes an auditable engineering process instead of intuition debate.\nReplay Benchmark Template (Python) import csv import statistics from dataclasses import dataclass from typing import List @dataclass class QuerySample: template: str latency_ms: float cross_bytes: int cross_hops: int def load_samples(path: str) -\u0026gt; List[QuerySample]: result: List[QuerySample] = [] with open(path, \u0026#34;r\u0026#34;, encoding=\u0026#34;utf-8\u0026#34;) as f: reader = csv.DictReader(f) for row in reader: result.append( QuerySample( template=row[\u0026#34;template\u0026#34;], latency_ms=float(row[\u0026#34;latency_ms\u0026#34;]), cross_bytes=int(row[\u0026#34;cross_bytes\u0026#34;]), cross_hops=int(row[\u0026#34;cross_hops\u0026#34;]), ) ) return result def p99(values: List[float]) -\u0026gt; float: if not values: return 0.0 values_sorted = sorted(values) idx = int(0.99 * (len(values_sorted) - 1)) return values_sorted[idx] def summarize(samples: List[QuerySample]) -\u0026gt; None: latency = [item.latency_ms for item in samples] cross_bytes = [item.cross_bytes for item in samples] cross_hops = [item.cross_hops for item in samples] print(\u0026#34;count =\u0026#34;, len(samples)) 
print(\u0026#34;avg_latency_ms =\u0026#34;, round(statistics.mean(latency), 2)) print(\u0026#34;p99_latency_ms =\u0026#34;, round(p99(latency), 2)) print(\u0026#34;avg_cross_bytes =\u0026#34;, int(statistics.mean(cross_bytes))) print(\u0026#34;avg_cross_hops =\u0026#34;, round(statistics.mean(cross_hops), 3)) if __name__ == \u0026#34;__main__\u0026#34;: baseline = load_samples(\u0026#34;baseline.csv\u0026#34;) candidate = load_samples(\u0026#34;candidate.csv\u0026#34;) print(\u0026#34;baseline\u0026#34;) summarize(baseline) print(\u0026#34;candidate\u0026#34;) summarize(candidate) This script is suitable for minimal before/after replay comparison: same templates, same inputs, unified metrics, no guesswork conclusions.\nE — Engineering (Applications) Scenario 1: Online Graph Queries (Edge-cut Dominant) Problem: high p99 on k-hop/path queries.\nApproach: prioritize reducing cross-partition edges along common query boundaries while maintaining load balance.\nBenefit: fewer cross-machine hops, more stable p95/p99.\nGoal: cut_ratio from 0.31 -\u0026gt; 0.18 Result: path query p99 from 420ms -\u0026gt; 230ms (example measurement) Scenario 2: Supernode Graphs (Vertex-cut More Stable) Problem: a few vertices have extremely high out-degree, causing severe single-machine hotspots under Edge-cut.\nApproach: edge partitioning + controlled vertex replication with RF monitoring.\nBenefit: spreads hotspot writes/traversals across partitions.\nScenario 3: Sharding and Capacity Planning (METIS Baseline + Incremental Migration) Problem: full repartition is expensive; business cannot tolerate frequent downtime migrations.\nApproach: periodically recompute offline baseline, migrate only high-benefit candidate subgraphs online.\nBenefit: continuously improve partition quality within migration budget.\nR — Reflection (Deep Dive) Complexity and Engineering Cost Partitioning is combinatorially hard; pursuing global optimum is unrealistic. 
Engineering should focus on sustainable optimization loops: baseline exists monitoring exists incremental repair exists Alternatives and Trade-offs Strategy Pros Cons Best for Edge-cut Simple query routing Supernodes can hotspot OLTP graph queries Vertex-cut Better hotspot control Replica consistency complexity Power-law graphs Random sharding Simple implementation High communication cost Early PoC only Quantitative Comparison (Example) Metric Strategy A (random) Strategy B (Edge-cut) Strategy C (Vertex-cut) cut ratio 0.34 0.19 0.22 RF 1.00 1.00 1.38 query p99 480ms 260ms 290ms network bytes 2.1GB/min 1.2GB/min 1.0GB/min Interpretation:\nEdge-cut is cleaner on read paths; Vertex-cut can be better for hotspot and network bytes, but requires replica-management cost; Real choice depends on read/write ratio and consistency requirements. Common Pitfalls Looking only at algorithm names, not objective functions: often leads to \u0026ldquo;great theory, poor production metrics.\u0026rdquo; Optimizing cut only, ignoring balance: latency drops but throughput collapses. One-time offline partition only: effects naturally decay in dynamic-graph settings. Counterexample (Must Remember) Suppose you place all hot vertices into one partition to reduce cut. Communication drops short-term, but that partition’s CPU spikes and queuing worsens p99.\nThis shows: partition optimization is multi-objective, not single-objective extreme optimization.\nFAQ and Caveats Can METIS directly solve online dynamic repartitioning?\nNo. METIS is better as an offline baseline; online operation requires incremental migration strategies.\nIs Edge-cut always better than Vertex-cut?\nNo. 
Under extreme high-degree imbalance, Vertex-cut is often more stable.\nHow to decide when to repartition?\nWatch trends, not points: rising cut_ratio, rising cross-partition bytes, rising p99 over sustained windows.\nHow to choose partition count k?\nSet an upper bound by machine budget first, then benchmark the joint curve of k vs p95/p99 and communication volume to find turning points.\nBest Practices and Recommendations Define main query paths first, then define partition objective Use business-weighted edges, not uniform edge weights Set migration budget ceilings per batch to avoid global jitter Monitor cut_ratio, RF, p99, and network bytes together; avoid single-metric decisions Prepare dedicated strategies for hot vertices (replication, side indexes, or caching) Migration Path (Skill Ladder) After mastering this article, progress in this order:\nDynamic incremental partitioning: migrate only high-benefit local subgraphs Query-aware partitioning: include query logs in partition weight modeling Multi-tier graph storage coordination: co-optimize partitioning with hot/cold tiering and cache strategy Online A/B validation framework: make partition strategies rollbackable, comparable, and auditable S — Summary Graph partitioning directly determines latency ceiling and network cost in graph databases. Edge-cut and Vertex-cut have no universal winner; workload shape decides. METIS’s core value is \u0026ldquo;multilevel scaling + local refinement,\u0026rdquo; not one-shot global optimality. A production-ready partition strategy must include objective, constraints, monitoring, and incremental repair. 
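The summary's requirement of "objective, constraints, monitoring, and incremental repair" can be sketched as a small migration acceptance gate. This is a minimal illustration, not any system's actual API; the `Score` weighting, field names, and thresholds are all illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class PartitionState:
    cut_ratio: float   # fraction of cross-partition edges
    p99_ms: float      # tail latency of the main query path
    max_cpu: float     # hottest partition CPU utilization (0..1)

def score(s: PartitionState, w_cut: float = 1.0, w_lat: float = 0.002) -> float:
    # Illustrative single-number objective: lower is better.
    return w_cut * s.cut_ratio + w_lat * s.p99_ms

def accept_migration(before: PartitionState, after: PartitionState,
                     cpu_limit: float = 0.80) -> bool:
    # Invariant: never violate the hotspot/capacity constraint.
    if after.max_cpu > cpu_limit:
        return False
    # Preservation: accept only if the objective does not get worse.
    return score(after) <= score(before)

# Numbers mirror the worked trace: cut 0.29 -> 0.23, p99 410 -> 330.
before = PartitionState(cut_ratio=0.29, p99_ms=410, max_cpu=0.70)
after = PartitionState(cut_ratio=0.23, p99_ms=330, max_cpu=0.75)
print(accept_migration(before, after))  # True: lower score, constraint holds
```

Because every accepted migration lowers a bounded, non-negative score while respecting the constraints, repeated application gives exactly the monotonic local improvement described in the proof sketch.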
Closing Conclusion One major engineering difference between graph databases and relational databases is that cross-edge communication can directly consume your performance budget.\nOnly by building robust partitioning capability can query performance move from \u0026ldquo;occasionally usable\u0026rdquo; to \u0026ldquo;predictably stable.\u0026rdquo;\nReferences and Further Reading METIS docs and paper: Karypis \u0026amp; Kumar, Multilevel k-way Partitioning Scheme PowerGraph (classic Vertex-cut engineering practice) Pregel / Giraph distributed graph computation models Neo4j / JanusGraph sharding and query practice materials Multi-Language Reference Implementations (Excerpt) C++: Compute Edge-cut #include \u0026lt;vector\u0026gt; #include \u0026lt;utility\u0026gt; int edgeCut(const std::vector\u0026lt;std::pair\u0026lt;int, int\u0026gt;\u0026gt;\u0026amp; edges, const std::vector\u0026lt;int\u0026gt;\u0026amp; part) { int cut = 0; for (const auto\u0026amp; edge : edges) { int u = edge.first; int v = edge.second; if (part[u] != part[v]) { cut += 1; } } return cut; } Go: Compute Partition Load package main func partitionSizes(part []int, k int) []int { sizes := make([]int, k) for _, partition := range part { sizes[partition]++ } return sizes } JavaScript: Compute Replication Factor function replicationFactor(edgeParts, nodeCount) { const nodeToParts = Array.from({ length: nodeCount }, () =\u0026gt; new Set()); for (const item of edgeParts) { const [u, v, p] = item; nodeToParts[u].add(p); nodeToParts[v].add(p); } let total = 0; for (const parts of nodeToParts) total += Math.max(parts.size, 1); return total / nodeCount; } Meta Information Reading time: 18-22 minutes Tags: graph partitioning, Edge-cut, Vertex-cut, METIS SEO keywords: Graph Partitioning, Edge-cut, Vertex-cut, METIS, Query Latency Meta description: How graph partitioning affects query latency and network communication, with runnable examples and an engineering tuning path. 
Call to Action (CTA) Choose one of your slowest online query templates, measure cross-partition hops and network bytes, then compare p95/p99 before and after one partition optimization pass.\nYou will quickly see: partition strategy is the core lever in graph-database performance engineering.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/100-graph-partitioning-edge-cut-vertex-cut-metis/","summary":"Starting from Edge-cut/Vertex-cut objective functions, this article systematically explains METIS-style multilevel partitioning and production implementation, with emphasis on how partitioning affects query latency and cross-machine traffic.","title":"Graph Partitioning Algorithms: Edge-cut vs Vertex-cut and an Engineering Guide to METIS"},{"content":" Subtitle / Abstract\nIn dynamic-graph workloads, the real pain point is not \u0026ldquo;do you know the algorithm,\u0026rdquo; but \u0026ldquo;can the system survive continuous updates.\u0026rdquo; Following the ACERS template, this article explains three engineering essentials: incremental shortest path, incremental PageRank, and connectivity maintenance, along with three practical strategies: local recomputation, lazy updates, and approximate results.\nEstimated reading time: 14-18 minutes Tags: dynamic graph, incremental computation, shortest path, PageRank, connectivity maintenance SEO keywords: dynamic graph, incremental shortest path, incremental PageRank, connectivity maintenance, local recomputation, lazy updates, approximate results Meta description: An engineering guide to dynamic graphs: how to control latency and cost in high-frequency update scenarios with incremental algorithms and practical system strategies. 
Target Audience Engineers building online services for graph databases, relationship graphs, and recommendation graphs Developers moving from offline graph computation to real-time incremental computation Tech leads who want to replace \u0026ldquo;full recomputation\u0026rdquo; with a production-ready update pipeline Background / Motivation Static graph algorithms look elegant in papers, but real production graphs are constantly changing:\nUser relations are added/removed Transaction edges continuously stream in Content graphs and knowledge graphs are continuously updated This is where 80% of engineering pain comes from:\nFull recomputation is too slow to keep up with update velocity Strong online consistency is too expensive and blows up P99 latency The business only needs \u0026ldquo;usable approximation\u0026rdquo; but teams implement \u0026ldquo;expensive exactness\u0026rdquo; So the core question becomes:\nIt is not how to compute an answer once, but how to keep computing under an update stream.\nCore Concepts Concept Meaning Engineering Focus Incremental shortest path After edge/node updates, repair only affected regions Impact-domain detection, local recomputation Incremental PageRank Local residual propagation after graph updates Residual threshold, batch window Connectivity maintenance Dynamically maintain connectivity / component changes Fast insertion, hard deletion Local recomputation Recompute only affected subgraphs Lower CPU/memory cost Lazy updates Merge updates into batches for unified processing Throughput first, controllable latency Approximate results Trade error bounds for compute cost SLA vs precision balance A — Algorithm (Problem and Algorithm) Problem Restatement (Engineering Form) Given a continuously updated graph G_t=(V_t,E_t) and an operation stream:\nadd_edge(u,v,w) remove_edge(u,v) query_shortest_path(s,t) query_pagerank_topk(k) query_connected(u,v) Maintain query results at low cost under sustained updates.\nInputs and Outputs Name 
Type Description graph adjacency list / CSR Graph structure updates update stream Edge insertions, deletions, weight changes queries query stream Path, ranking, connectivity return query result Path distance / ranking / boolean connectivity Example 1: Incremental Shortest Path Initial: A-\u0026gt;B(1), B-\u0026gt;C(1), A-\u0026gt;C(5) Shortest path A-\u0026gt;C = 2 Update: A-\u0026gt;C weight drops to 1 Only local repair in A/C neighborhood is needed, shortest path becomes 1 Example 2: Connectivity Update The graph has two components G1, G2 Add edge x(G1)-y(G2) The connectivity structure should quickly reflect \u0026#34;component merge\u0026#34; Deriving the Approach (From Full to Incremental) Naive Approach: Full Recompute After Every Update Shortest path: full-graph Dijkstra / APSP PageRank: iterate on full graph until convergence Connectivity: full-graph BFS/DFS relabeling Problem: cost explodes under frequent updates.\nKey Observations Most updates only affect local subgraphs Queries usually tolerate short eventual-consistency windows Ranking/recommendation systems often accept controlled error Method Selection Local recomputation: prioritize shrinking the affected region Lazy updates: merge high-frequency small updates into batches Approximate results: set an error threshold to trade for throughput C — Concepts (Core Ideas) 1) Incremental Shortest Path For edge insertion/weight decrease: trigger local relaxation from affected endpoints For edge deletion/weight increase: detect invalid shortest paths and rebuild local trees (harder) Common engineering practice:\nProcess \u0026ldquo;shortening\u0026rdquo; updates online Route \u0026ldquo;lengthening/deletion\u0026rdquo; updates into an async repair queue 2) Incremental PageRank Maintain both rank and residual On edge updates, propagate residual only around affected nodes Stop propagation when residual is below threshold 3) Connectivity Maintenance Insert-only edges: Union-Find is very efficient With edge 
deletions: needs more complex dynamic connectivity structures; in practice teams often use a compromise of \u0026ldquo;hierarchical rebuild + batch processing\u0026rdquo; Real-World Conclusion (Core) Most production systems do not do \u0026ldquo;fully exact full recomputation on every update.\u0026rdquo;\nA typical solution is: local recomputation + lazy updates + approximate results.\nPractical Guide / Steps Step 1: Separate Update and Query Paths Queries read from \u0026ldquo;published snapshots\u0026rdquo; Updates are appended to an \u0026ldquo;incremental log\u0026rdquo; and applied asynchronously Step 2: Define the Affected Region Shortest path: radius expansion from updated edge endpoints as seeds PageRank: residual propagation from updated nodes Connectivity: record affected components and calibrate asynchronously Step 3: Runnable Python Skeleton from collections import defaultdict, deque import heapq class DynamicGraphEngine: def __init__(self): self.g = defaultdict(dict) # g[u][v] = w self.pending = deque() # update log def add_edge(self, u, v, w=1.0): self.pending.append((\u0026#34;add\u0026#34;, u, v, w)) def remove_edge(self, u, v): self.pending.append((\u0026#34;del\u0026#34;, u, v, None)) def flush_updates(self, budget=1000): \u0026#34;\u0026#34;\u0026#34;Deferred updates: apply in batches under a budget cap.\u0026#34;\u0026#34;\u0026#34; cnt = 0 while self.pending and cnt \u0026lt; budget: op, u, v, w = self.pending.popleft() if op == \u0026#34;add\u0026#34;: self.g[u][v] = w else: self.g[u].pop(v, None) cnt += 1 def shortest_path_local(self, s, t, max_hops=8): \u0026#34;\u0026#34;\u0026#34;Local recomputation example: bound expansion depth/state size.\u0026#34;\u0026#34;\u0026#34; pq = [(0.0, 0, s)] # dist, hops, node dist = {s: 0.0} while pq: d, h, u = heapq.heappop(pq) if u == t: return d if h \u0026gt;= max_hops: continue if d != dist.get(u): continue for v, w in self.g[u].items(): nd = d + w if nd \u0026lt; dist.get(v, 
float(\u0026#34;inf\u0026#34;)): dist[v] = nd heapq.heappush(pq, (nd, h + 1, v)) return float(\u0026#34;inf\u0026#34;) if __name__ == \u0026#34;__main__\u0026#34;: eng = DynamicGraphEngine() eng.add_edge(\u0026#34;A\u0026#34;, \u0026#34;B\u0026#34;, 1) eng.add_edge(\u0026#34;B\u0026#34;, \u0026#34;C\u0026#34;, 1) eng.add_edge(\u0026#34;A\u0026#34;, \u0026#34;C\u0026#34;, 5) eng.flush_updates() print(eng.shortest_path_local(\u0026#34;A\u0026#34;, \u0026#34;C\u0026#34;)) # 2 eng.add_edge(\u0026#34;A\u0026#34;, \u0026#34;C\u0026#34;, 1) eng.flush_updates() print(eng.shortest_path_local(\u0026#34;A\u0026#34;, \u0026#34;C\u0026#34;)) # 1 E — Engineering (Applications) Scenario 1: Online Shortest-Chain Query in Social Graphs Background: The user relationship graph changes continuously; query \u0026ldquo;the shortest relationship chain between you and someone.\u0026rdquo;\nWhy it fits: Shortest-path updates are strongly local, so local recomputation plus depth capping works well.\n// Pseudocode: run bidirectional BFS only within maxDepth at query time. // Return an approximate hop count online; complete exact path asynchronously. Scenario 2: Incremental PageRank on Recommendation Graphs Background: Content edges and click edges keep changing; rankings must refresh continuously.\nWhy it fits: Incremental PageRank propagates only affected residual and avoids full iterations.\n# Core idea: inject residual to updated nodes, then push locally to epsilon. # Stop propagating when residual \u0026lt; epsilon. 
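The residual-push idea above can be made runnable in a few lines. This is a simplified sketch of push-style approximate PageRank updates; the graph, `alpha`, and `eps` values are illustrative, and sinks simply absorb their residual:

```python
from collections import defaultdict, deque

def push_residuals(adj, rank, residual, alpha=0.15, eps=1e-4):
    """Spend residual at each node: keep alpha locally as rank, spread
    (1 - alpha) to out-neighbors; stop once all residual < eps."""
    queue = deque(n for n, r in residual.items() if r >= eps)
    while queue:
        u = queue.popleft()
        r = residual[u]
        if r < eps:
            continue
        residual[u] = 0.0
        rank[u] = rank.get(u, 0.0) + alpha * r
        out = adj.get(u, [])
        if not out:
            continue  # sink: remaining residual is dropped in this sketch
        share = (1 - alpha) * r / len(out)
        for v in out:
            residual[v] = residual.get(v, 0.0) + share
            if residual[v] >= eps:
                queue.append(v)
    return rank, residual

adj = {"A": ["B", "C"], "B": ["C"], "C": []}
# An update touched A, so inject residual there instead of re-iterating the full graph.
rank, residual = push_residuals(adj, {}, defaultdict(float, {"A": 1.0}))
print(sorted(rank, key=rank.get, reverse=True))  # ['A', 'C', 'B']
```

The key property for dynamic graphs is that work is proportional to the residual injected by the update, not to the size of the whole graph.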
Scenario 3: Connectivity Alerting in Transaction Graphs Background: New transaction edges continuously arrive, and the system must quickly detect whether suspicious groups are connected.\nWhy it fits: Use Union-Find for fast union on insertions; put deletions into a lazy verification queue.\nclass DSU { constructor(n) { this.p = Array.from({length:n}, (_,i)=\u0026gt;i); } find(x){ return this.p[x]===x?x:(this.p[x]=this.find(this.p[x])); } union(a,b){ this.p[this.find(a)] = this.find(b); } connected(a,b){ return this.find(a)===this.find(b); } } R — Reflection (Deeper Thinking) Complexity and Cost Module Full recomputation Incremental strategy Shortest path High (full graph) Medium (affected region) PageRank High (multi-round full-graph iteration) Medium (local residual push) Connectivity Medium-high (deletions are hard) Low for insertions, deletions need compromise Alternative Strategies Strong-consistency full recomputation\nPros: exact results Cons: low throughput, high cost Weak-consistency incremental + async repair (mainstream)\nPros: stable online performance Cons: approximate error exists in short windows Pure online approximation + periodic full correction\nPros: strong real-time behavior Cons: requires error monitoring and backfill mechanisms Why This Is the Most Practical Engineering Path Naturally compatible with update streams Keeps latency and cost within budget Supports gradual evolution from \u0026ldquo;usable\u0026rdquo; to \u0026ldquo;more precise\u0026rdquo; Explanation and Principles (Why This Works) In dynamic graphs, algorithm problems often degrade into system problems:\nYou cannot stop updates from arriving You cannot perform perfect recomputation every time You must make explainable trade-offs among correctness, latency, and cost So \u0026ldquo;local recomputation, lazy updates, and approximate results\u0026rdquo; is not a temporary workaround, but a primary design principle.\nFAQ and Caveats When is full recomputation mandatory?\nWhen 
accumulated error exceeds threshold, or a critical business window requires high precision.\nWhy are edge deletions always harder?\nBecause they can invalidate existing optimal structures and require rollback/rebuild.\nHow do we explain approximate results to the business?\nClearly define error bounds and refresh periods, and provide an eventual-consistency commitment.\nHow do we avoid update storms overwhelming the system?\nSet batch windows, backpressure policies, and query degradation paths.\nBest Practices and Recommendations Define SLA first, then choose exact vs approximate strategy Decouple updates and queries: logged increments + snapshot serving Maintain a \u0026ldquo;recompute budget\u0026rdquo; per algorithm: time, node count, error threshold Must-have observability: update backlog, recomputation hit rate, error drift S — Summary Core Takeaways The real challenge in dynamic graph engineering is the update stream, not a single query Incremental shortest path, incremental PageRank, and connectivity maintenance are the three foundational capabilities Local recomputation, lazy updates, and approximate results are mainstream production strategies Insertions are usually easier; deletions need async repair mechanisms Metrics monitoring and error governance are the lifeline of stable incremental systems Recommended Further Reading Dynamic Graph Algorithms (survey) Bahmani et al. Incremental PageRank at scale Holm, de Lichtenberg, Thorup (dynamic connectivity) Meta Information Reading time: 14-18 minutes Tags: dynamic graph, incremental computation, shortest path, PageRank, connectivity maintenance SEO keywords: dynamic graph, incremental shortest path, incremental PageRank, connectivity maintenance, local recomputation Meta description: Engineering guide to dynamic graph incremental computation: core algorithms, implementation strategies, and production trade-offs. 
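The "recompute budget" and backpressure recommendations above can be sketched as a small gate in front of the update pipeline. The class name, thresholds, and the shape of `apply_fn` are illustrative assumptions, not recommendations for any specific system:

```python
from collections import deque
import time

class UpdateBudgetGate:
    """Apply pending updates in batches, stopping when either the
    time budget or the node-touch budget for this round is exhausted."""

    def __init__(self, time_budget_s=0.050, node_budget=500):
        self.pending = deque()
        self.time_budget_s = time_budget_s
        self.node_budget = node_budget

    def submit(self, update):
        self.pending.append(update)

    def drain(self, apply_fn):
        # apply_fn(update) applies one update and returns nodes touched.
        start = time.monotonic()
        touched = 0
        while self.pending:
            if time.monotonic() - start > self.time_budget_s:
                break  # time budget exhausted; resume next round
            if touched >= self.node_budget:
                break  # node budget exhausted
            touched += apply_fn(self.pending.popleft())
        return touched, len(self.pending)  # work applied, remaining backlog
```

The returned backlog length is exactly the "update backlog" observability signal mentioned above: if it grows round over round, the budget (or the cluster) is undersized.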
Call to Action (CTA) Two practical next steps:\nSplit your current graph query service into \u0026ldquo;query snapshots + incremental update pipeline\u0026rdquo; Launch approximate mode first with error monitoring, then gradually increase precision If you want, the next article can provide a practical template for \u0026ldquo;error budgets and backfill strategy (SLA-driven).\u0026rdquo;\nMulti-Language Reference Implementations (Python / C / C++ / Go / Rust / JS) # incremental shortest path (bounded local recompute) - simplified import heapq def local_dijkstra(graph, s, t, max_nodes=1000): pq = [(0, s)] dist = {s: 0} seen = 0 while pq and seen \u0026lt; max_nodes: d, u = heapq.heappop(pq) if d != dist.get(u): continue seen += 1 if u == t: return d for v, w in graph.get(u, []): nd = d + w if nd \u0026lt; dist.get(v, 10**18): dist[v] = nd heapq.heappush(pq, (nd, v)) return float(\u0026#34;inf\u0026#34;) /* union-find for dynamic connectivity (insert-only fast path) */ #include \u0026lt;stdio.h\u0026gt; int p[1000]; int find(int x){ return p[x]==x?x:(p[x]=find(p[x])); } void uni(int a,int b){ p[find(a)] = find(b); } int main(){ for(int i=0;i\u0026lt;10;i++) p[i]=i; uni(1,2); uni(2,3); printf(\u0026#34;%d\\n\u0026#34;, find(1)==find(3)); // 1 return 0; } #include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; struct DSU { vector\u0026lt;int\u0026gt; p; DSU(int n): p(n) { iota(p.begin(), p.end(), 0); } int find(int x){ return p[x]==x?x:p[x]=find(p[x]); } void unite(int a,int b){ p[find(a)] = find(b); } bool conn(int a,int b){ return find(a)==find(b); } }; int main(){ DSU d(6); d.unite(0,1); d.unite(1,2); cout \u0026lt;\u0026lt; d.conn(0,2) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; // 1 } package main import \u0026#34;fmt\u0026#34; type DSU struct{ p []int } func NewDSU(n int) *DSU { d:=\u0026amp;DSU{make([]int,n)}; for i:=0;i\u0026lt;n;i++{d.p[i]=i}; return d } func (d *DSU) Find(x int) int { if d.p[x]!=x { d.p[x]=d.Find(d.p[x]) }; return d.p[x] } func (d *DSU) 
Union(a,b int){ d.p[d.Find(a)] = d.Find(b) } func main(){ d := NewDSU(6) d.Union(1,2); d.Union(2,3) fmt.Println(d.Find(1)==d.Find(3)) // true } struct DSU { p: Vec\u0026lt;usize\u0026gt; } impl DSU { fn new(n: usize) -\u0026gt; Self { Self { p: (0..n).collect() } } fn find(\u0026amp;mut self, x: usize) -\u0026gt; usize { if self.p[x] != x { let r = self.find(self.p[x]); self.p[x] = r; } self.p[x] } fn union(\u0026amp;mut self, a: usize, b: usize) { let ra = self.find(a); let rb = self.find(b); self.p[ra] = rb; } } fn main() { let mut d = DSU::new(5); d.union(0, 1); d.union(1, 2); println!(\u0026#34;{}\u0026#34;, d.find(0) == d.find(2)); } // lazy update queue skeleton const pending = []; function addEdge(u, v, w) { pending.push({ op: \u0026#34;add\u0026#34;, u, v, w }); } function flush(graph, budget = 100) { let cnt = 0; while (pending.length \u0026amp;\u0026amp; cnt \u0026lt; budget) { const e = pending.shift(); if (e.op === \u0026#34;add\u0026#34;) { if (!graph.has(e.u)) graph.set(e.u, []); graph.get(e.u).push([e.v, e.w]); } cnt += 1; } } ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/90-dynamic-graph-incremental-computation/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nIn dynamic-graph workloads, the real pain point is not \u0026ldquo;do you know the algorithm,\u0026rdquo; but \u0026ldquo;can the system survive continuous updates.\u0026rdquo; Following the ACERS template, this article explains three engineering essentials: \u003cstrong\u003eincremental shortest path, incremental PageRank, and connectivity maintenance\u003c/strong\u003e, along with three practical strategies: \u003cstrong\u003elocal recomputation, lazy updates, and approximate results\u003c/strong\u003e.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 14-18 
minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003edynamic graph\u003c/code\u003e, \u003ccode\u003eincremental computation\u003c/code\u003e, \u003ccode\u003eshortest path\u003c/code\u003e, \u003ccode\u003ePageRank\u003c/code\u003e, \u003ccode\u003econnectivity maintenance\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: dynamic graph, incremental shortest path, incremental PageRank, connectivity maintenance, local recomputation, lazy updates, approximate results\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: An engineering guide to dynamic graphs: how to control latency and cost in high-frequency update scenarios with incremental algorithms and practical system strategies.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-audience\"\u003eTarget Audience\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eEngineers building online services for graph databases, relationship graphs, and recommendation graphs\u003c/li\u003e\n\u003cli\u003eDevelopers moving from offline graph computation to real-time incremental computation\u003c/li\u003e\n\u003cli\u003eTech leads who want to replace \u0026ldquo;full recomputation\u0026rdquo; with a production-ready update pipeline\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eStatic graph algorithms look elegant in papers, but real production graphs are constantly changing:\u003c/p\u003e","title":"Dynamic Graphs and Incremental Computation: ACERS Guide to Incremental Shortest Path, Incremental PageRank, and Connectivity Maintenance"},{"content":" Subtitle / Abstract\nCommunity detection is not just \u0026ldquo;splitting a graph into a few groups.\u0026rdquo; In production, you must balance accuracy, interpretability, speed, and maintainability. 
Following the ACERS structure, this article breaks down two of the most common engineering choices: Louvain (modularity optimization) and Label Propagation (LPA).\nEstimated reading time: 12-16 minutes Tags: Community Detection, Louvain, Label Propagation, Graph Partitioning SEO keywords: community detection, Louvain, Label Propagation, modularity, graph partition Meta description: Engineering primer for community detection: principles, complexity, algorithm selection, and implementation templates for Louvain and LPA across group discovery, graph partitioning, and cold start. Target Audience Engineers working on social graphs, risk-control graphs, or recommender-system graph analytics Developers who want to move community detection from paper concepts into production workflows Practitioners modeling group structure for graph partitioning and cold-start scenarios Background / Motivation Community detection appears frequently in production:\nGroup identification: detect strongly related account clusters, suspicious groups, or interest circles Graph partitioning: place tightly connected subgraphs on the same shard to reduce cross-shard traffic Cold-start analysis: quickly classify new users/entities via neighborhood community structure The pain points are:\nGlobal optimum is usually unattainable (related objectives are NP-hard) Data is large and fast-changing, so offline algorithms are hard to rerun frequently Different products prioritize stability, speed, and interpretability differently So in practice, two methods dominate:\nLouvain: optimize for higher-quality communities (modularity) Label Propagation (LPA): optimize for speed and implementation simplicity Core Concepts Concept Meaning Engineering Impact Community a node set dense inside and sparse outside impacts partition and recommendation quality Modularity (Q) metric for partition quality Louvain optimization target Label Propagation nodes iteratively adopt majority neighbor labels fast but stochastic Graph 
Partition split storage/compute by community reduces cross-machine communication cost Cold Start quickly assign new nodes by neighborhood structure improves early-stage recall A - Algorithm (Problem and Algorithm) Problem Restatement (Engineering Abstraction) Given an undirected graph G=(V,E), output a community ID for each node, supporting:\nGroup identification (community membership output) Graph partitioning (map communities to shards) Cold-start assignment (map new nodes to candidate communities) Input / Output Name Type Description graph Dict[int, Set[int]] adjacency list (undirected graph) return Dict[int, int] node-to-community label mapping Example 1 0-1-2 form one cluster, 3-4-5 form another, with a weak edge between 2 and 3 Possible output: {0,1,2} -\u0026gt; C1, {3,4,5} -\u0026gt; C2 Example 2 Star graph (one center connected to multiple leaves) LPA often merges the center with most leaves into one community Reasoning Path (From Naive to Practical) Naive Approach: Connected Components Partition only by connectivity Cannot express cases where a weak bridge should separate two groups Key Observations Community is not just \u0026ldquo;connected\u0026rdquo;; it means \u0026ldquo;internally denser\u0026rdquo; Global optimum is unrealistic; scalable heuristics are preferred in production Different tasks require different priorities: Quality-first: Louvain Latency-first: LPA Method Selection Louvain: modularity-driven, typically more stable quality LPA: lightest implementation, suitable for very large graph coarse clustering C - Concepts (Core Ideas) Louvain: Modularity Maximization A common modularity form:\n$$ Q=\\frac{1}{2m}\\sum_{ij}\\left(A_{ij}-\\frac{k_i k_j}{2m}\\right)\\delta(c_i,c_j) $$\nWhere:\nA_ij: adjacency matrix entry k_i: degree of node i m: number of edges delta(c_i,c_j): 1 if same community, else 0 Louvain runs in a two-phase loop:\nLocal move: try moving each node into a neighbor community; accept when dQ \u0026gt; 0 Community aggregation: 
collapse communities into super-nodes and repeat Label Propagation: Neighbor Majority Voting Start with one label per node and iterate:\nNew label = most frequent neighbor label (ties broken randomly or by rule) Stop at convergence or max iterations Pros:\nSimple and fast Cons:\nResults depend on update order and randomness Stability is usually weaker than Louvain Minimal Mental Model Louvain: explicit objective optimization (Q) LPA: local consistency diffusion (majority label) Practical Guide / Steps Define your objective first: quality-first or latency-first Run Louvain on small-to-medium data first to build a quality baseline For large-scale online systems, run LPA coarse grouping first, then business post-processing Fix random seeds and record versions for reproducibility For cold start, use \u0026ldquo;neighbor label vote + confidence threshold\u0026rdquo; Runnable Python example (python3 community_demo.py):\nfrom collections import Counter import random def label_propagation(graph, max_iter=20, seed=42): random.seed(seed) label = {u: u for u in graph} nodes = list(graph.keys()) for _ in range(max_iter): changed = 0 random.shuffle(nodes) for u in nodes: if not graph[u]: continue cnt = Counter(label[v] for v in graph[u]) best = max(cnt.items(), key=lambda x: (x[1], -x[0]))[0] if label[u] != best: label[u] = best changed += 1 if changed == 0: break return label def cold_start_assign(graph, labels, new_neighbors): # new_neighbors: known neighbor list of the new node cnt = Counter(labels[v] for v in new_neighbors if v in labels) if not cnt: return None return cnt.most_common(1)[0][0] if __name__ == \u0026#34;__main__\u0026#34;: graph = { 0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}, } labels = label_propagation(graph) print(\u0026#34;labels:\u0026#34;, labels) print(\u0026#34;new node -\u0026gt;\u0026#34;, cold_start_assign(graph, labels, [0, 2])) E - Engineering (Engineering Applications) Scenario 1: Group Identification (Python) 
Background: detect tightly connected groups in social or transaction graphs.\nWhy this fits: both Louvain and LPA quickly generate community labels that feed risk rules and visual analytics.\ndef group_by_label(labels): out = {} for u, c in labels.items(): out.setdefault(c, []).append(u) return out Scenario 2: Graph Partition Mapping (Go) Background: when graph storage is sharded, you want nodes in the same community to land in the same partition as much as possible.\nWhy this fits: community labels can be converted directly into partition keys, reducing cross-shard edge lookups.\npackage main import \u0026#34;fmt\u0026#34; func partitionByCommunity(labels map[int]int, shardCount int) map[int]int { part := make(map[int]int) for node, comm := range labels { part[node] = comm % shardCount } return part } func main() { labels := map[int]int{0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2} fmt.Println(partitionByCommunity(labels, 4)) } Scenario 3: Cold-start Community Assignment (JavaScript) Background: a new user node has little history but a small set of neighbor links.\nWhy this fits: voting by neighbor communities provides a fast initial group for recommendation/recall pipelines.\nfunction assignCommunity(labels, neighbors) { const cnt = new Map(); for (const v of neighbors) { if (labels[v] === undefined) continue; cnt.set(labels[v], (cnt.get(labels[v]) || 0) + 1); } let best = null; let bestCnt = -1; for (const [c, n] of cnt.entries()) { if (n \u0026gt; bestCnt) { bestCnt = n; best = c; } } return best; } console.log(assignCommunity({0: 1, 2: 1, 3: 2}, [0, 2, 3])); R - Reflection (Reflection and Deep Dive) Complexity (Engineering View) LPA: about O(E) per round, total O(T*E) (T = iteration rounds) Louvain: common implementations are close to multi-round O(E) behavior, but constants depend on data distribution Alternatives and Trade-offs Method Pros Cons Best Fit Louvain usually better community quality more complex implementation, non-trivial incremental maintenance offline 
analysis, quality-first LPA fast, simple, parallelizable weaker stability very large graphs, real-time coarse clustering Spectral clustering strong mathematical properties expensive on large graphs fine-grained analysis on small/medium graphs Common Mistakes Looking only at algorithm name while ignoring query/update ratio Treating one LPA run as absolute truth without stability evaluation Hard-assigning cold-start nodes without preserving a \u0026ldquo;low-confidence pending\u0026rdquo; state Why This Works in Production Louvain and LPA form a complementary quality-speed pair Community labels directly power group identification, partitioning, and cold start You can run LPA first for approximation, then refine key subgraphs with Louvain Frequently Asked Questions and Notes Is Louvain always better than LPA?\nNot always. Louvain often gives better quality, but in high-throughput real-time settings LPA may be a better choice.\nDo I need to predefine the number of communities?\nUsually no for Louvain/LPA, which is one reason they are engineering-friendly.\nIs pure neighbor voting safe for cold start?\nAdd threshold and fallback logic: when confidence is low, route to an \u0026ldquo;undetermined\u0026rdquo; group first.\nBest Practices and Recommendations Define evaluation metrics first: modularity, business hit rate, stability Fix random seeds and run variance checks across multiple runs Prioritize low-latency methods online, then refine in offline batch Version community labels for rollback, tracing, and gradual rollout S - Summary (Summary) Core Takeaways Community detection is structural-signal modeling, not just graph-clustering visualization Louvain fits quality-first goals; LPA fits speed-first goals Group identification, graph partitioning, and cold start can all reuse community labels directly Production rollout should use a two-stage strategy: fast coarse grouping + focused refinement Metrics and reproducibility (seed/version) matter as much as the algorithm 
itself Recommended Further Reading Blondel et al., Fast unfolding of communities in large networks (Louvain) Raghavan et al., Near linear time algorithm to detect community structures (LPA) Graph partitioning in distributed graph systems (engineering sharding practice) Metadata Reading time: 12-16 minutes Tags: Community Detection, Louvain, LPA, Graph Partitioning, Cold Start SEO keywords: community detection, Louvain, Label Propagation, graph partition Meta description: Engineering selection and implementation of Louvain vs Label Propagation for group identification, graph partitioning, and cold-start analysis. Call To Action (CTA) A practical next step is to do two things:\nRun Louvain and LPA on your real business graph and compare modularity with business metrics Add \u0026ldquo;community confidence threshold + fallback logic\u0026rdquo; to your cold-start strategy and track online conversion changes\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) from collections import Counter def lpa(graph, rounds=10): label = {u: u for u in graph} for _ in range(rounds): changed = 0 for u in graph: if not graph[u]: continue cnt = Counter(label[v] for v in graph[u]) best = max(cnt, key=cnt.get) if label[u] != best: label[u] = best changed += 1 if changed == 0: break return label #include \u0026lt;stdio.h\u0026gt; // Simplified demo: community-label to partition mapping (not full Louvain/LPA) int main(void) { int labels[] = {1,1,1,2,2,2}; int n = 6, shard = 4; for (int i = 0; i \u0026lt; n; ++i) { printf(\u0026#34;node=%d comm=%d part=%d\\n\u0026#34;, i, labels[i], labels[i] % shard); } return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;unordered_map\u0026gt; #include \u0026lt;vector\u0026gt; std::unordered_map\u0026lt;int, 
std::vector\u0026lt;int\u0026gt;\u0026gt; groupBy(const std::vector\u0026lt;int\u0026gt;\u0026amp; label) { std::unordered_map\u0026lt;int, std::vector\u0026lt;int\u0026gt;\u0026gt; g; for (int i = 0; i \u0026lt; (int)label.size(); ++i) g[label[i]].push_back(i); return g; } int main() { std::vector\u0026lt;int\u0026gt; label = {1,1,1,2,2,2}; auto g = groupBy(label); for (auto\u0026amp; kv : g) { std::cout \u0026lt;\u0026lt; \u0026#34;comm \u0026#34; \u0026lt;\u0026lt; kv.first \u0026lt;\u0026lt; \u0026#34;: \u0026#34;; for (int u : kv.second) std::cout \u0026lt;\u0026lt; u \u0026lt;\u0026lt; \u0026#34; \u0026#34;; std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } } package main import \u0026#34;fmt\u0026#34; func assignCommunity(labels map[int]int, neighbors []int) (int, bool) { cnt := map[int]int{} for _, v := range neighbors { if c, ok := labels[v]; ok { cnt[c]++ } } best, bestN := 0, 0 for c, n := range cnt { if n \u0026gt; bestN { best, bestN = c, n } } if bestN == 0 { return 0, false } return best, true } func main() { labels := map[int]int{0: 1, 2: 1, 3: 2} comm, ok := assignCommunity(labels, []int{0, 2, 3}) fmt.Println(comm, ok) } use std::collections::HashMap; fn assign_community(labels: \u0026amp;HashMap\u0026lt;i32, i32\u0026gt;, neighbors: \u0026amp;[i32]) -\u0026gt; Option\u0026lt;i32\u0026gt; { let mut cnt: HashMap\u0026lt;i32, i32\u0026gt; = HashMap::new(); for \u0026amp;v in neighbors { if let Some(\u0026amp;c) = labels.get(\u0026amp;v) { *cnt.entry(c).or_insert(0) += 1; } } cnt.into_iter().max_by_key(|(_, n)| *n).map(|(c, _)| c) } fn main() { let mut labels = HashMap::new(); labels.insert(0, 1); labels.insert(2, 1); labels.insert(3, 2); println!(\u0026#34;{:?}\u0026#34;, assign_community(\u0026amp;labels, \u0026amp;[0, 2, 3])); } function assignCommunity(labels, neighbors) { const cnt = new Map(); for (const v of neighbors) { if (labels[v] === undefined) continue; cnt.set(labels[v], (cnt.get(labels[v]) || 0) + 1); } let best = null, bestN = 
-1; for (const [c, n] of cnt) { if (n \u0026gt; bestN) { best = c; bestN = n; } } return best; } console.log(assignCommunity({0: 1, 2: 1, 3: 2}, [0, 2, 3])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/80-community-detection-louvain-label-propagation/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nCommunity detection is not just \u0026ldquo;splitting a graph into a few groups.\u0026rdquo; In production, you must balance accuracy, interpretability, speed, and maintainability. Following the ACERS structure, this article breaks down two of the most common engineering choices: \u003cstrong\u003eLouvain (modularity optimization)\u003c/strong\u003e and \u003cstrong\u003eLabel Propagation (LPA)\u003c/strong\u003e.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 12-16 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eCommunity Detection\u003c/code\u003e, \u003ccode\u003eLouvain\u003c/code\u003e, \u003ccode\u003eLabel Propagation\u003c/code\u003e, \u003ccode\u003eGraph Partitioning\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: community detection, Louvain, Label Propagation, modularity, graph partition\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Engineering primer for community detection: principles, complexity, algorithm selection, and implementation templates for Louvain and LPA across group discovery, graph partitioning, and cold start.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-audience\"\u003eTarget Audience\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eEngineers working on social graphs, risk-control graphs, or recommender-system graph analytics\u003c/li\u003e\n\u003cli\u003eDevelopers who want to move community 
detection from paper concepts into production workflows\u003c/li\u003e\n\u003cli\u003ePractitioners modeling group structure for graph partitioning and cold-start scenarios\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eCommunity detection appears frequently in production:\u003c/p\u003e","title":"Community Detection Primer: Engineering Trade-offs Between Louvain and Label Propagation - ACERS Analysis"},{"content":" Subtitle / Abstract\nSubgraph matching is one of the hardest parts of graph querying: NP-hard in theory, but not automatically \u0026ldquo;too slow\u0026rdquo; in production. Following the ACERS template, this article explains VF2 and Ullmann clearly, and focuses on what actually decides performance: candidate generation and pruning strategy.\nEstimated reading time: 15-20 minutes Tags: Subgraph Matching, VF2, Ullmann, Graph Database SEO keywords: Subgraph Isomorphism, VF2, Ullmann, candidate pruning, graph pattern matching Meta description: Starting from NP-hard subgraph isomorphism, this article explains VF2/Ullmann mechanics and practical pruning strategies for constrained graph-database pattern queries. 
Target Audience Engineers building pattern queries, rule detection, or risk-relationship mining in graph databases Developers who already know BFS/DFS/connected components and want stronger graph-matching skills Algorithm practitioners balancing explainable rule matching against performance limits Background / Motivation In graph databases, you regularly face requirements like:\nFind a suspicious structure such as \u0026ldquo;one person - two companies - same device\u0026rdquo; Find a specific \u0026ldquo;author - paper - institution\u0026rdquo; pattern Find \u0026ldquo;cyclic laundering templates\u0026rdquo; in transaction chains These queries are essentially Subgraph Isomorphism: given a pattern graph Q, find an embedding in data graph G that satisfies both structure and constraints.\nTheoretically this is NP-hard, so worst-case exponential blowups are unavoidable.\nIn production, however, most queries are constrained patterns (labels, directions, attributes, and small pattern size), so performance usually depends on this:\nShrink candidates aggressively first, then run matching search.\nCore Concepts Subgraph Isomorphism: an injective mapping from pattern nodes to data nodes that preserves edges Constrained pattern: restrictions on label, direction, degree, and attribute predicates Candidate set: possible data nodes for each pattern node Pruning: reject impossible mappings early to reduce backtracking branches VF2: depth-first matching framework using state expansion plus feasibility checks Ullmann: classic method based on candidate matrix and iterative neighborhood consistency refinement A - Algorithm (Problem and Algorithm) Problem Restatement (Engineering Version) Given:\nData graph G=(V_G,E_G) (usually large) Pattern graph Q=(V_Q,E_Q) (usually small) Node/edge constraints (labels, direction, attribute predicates) Goal:\nDecide whether a match exists (existence) Or return all valid mapping results (enumeration) Input / Output Name Type Description G graph 
data graph, usually large Q graph pattern graph, usually small constraints constraints label / degree / attributes / direction, etc. return bool / mappings existence result or mapping list Example 1 (Match Exists) Pattern Q: A -knows-\u0026gt; B -works_at-\u0026gt; C Data G: multiple A/B/C-labeled nodes and directed edges Result: at least one mapping satisfies labels and direction Example 2 (Pruned Away) Pattern Q: node X has degree\u0026gt;=4 and label=Merchant Data G: all Merchant nodes have max degree=2 Result: empty candidate set -\u0026gt; fail immediately (no backtracking needed) Reasoning Path (From Brute Force to Practical) Naive Brute Force Enumerate permutations/combinations of data nodes for the |V_Q| pattern nodes Verify every pattern edge Complexity is roughly exponential and unusable in real systems.\nKey Observations Pattern graphs are usually small, but data graphs can be huge Most candidates can be filtered out by \u0026ldquo;label + degree + neighborhood\u0026rdquo; checks VF2/Ullmann only become effective after the candidate space is reduced Method Choice Theoretical framing: Subgraph Isomorphism is NP-hard Engineering pipeline: candidate generation -\u0026gt; candidate pruning -\u0026gt; backtracking match Implementation choices: both VF2 and Ullmann fit this pipeline C - Concepts (Core Ideas) VF2 Idea (More Common in Practice) Extend partial mapping M step by step At each step, choose one pattern node u and try candidate v Run feasibility checks: Semantic constraints (label / attribute) Topological constraints (edge consistency with already matched neighbors) Frontier consistency (in/out frontier) Backtrack immediately when infeasible Ullmann Idea (Matrix Refinement) Initial candidate matrix C[u][v] means u may map to v Repeatedly apply neighborhood-consistency propagation (refinement) Perform backtracking after matrix contraction Relationship Between Them Ullmann is closer to \u0026ldquo;strong preprocessing first, then search\u0026rdquo; VF2 is 
closer to \u0026ldquo;search while doing local feasibility checks\u0026rdquo; In production they are often combined: Ullmann-style candidate refinement + VF2-style search Why Candidate Pruning Matters More Search complexity is roughly driven by:\n[ \\prod_{u \\in V_Q} |Cand(u)| ]\nIf |Cand(u)| drops from 100 to 5, search-tree size changes by orders of magnitude even with the same algorithm.\nPractical Guide / Steps Normalize the pattern: fix node order (high-constraint nodes first) Generate candidates: prefilter by label/type/degree Refine candidates: iterative neighborhood consistency (Ullmann-style) Backtracking match: injective mapping + adjacency consistency checks (VF2-style) Early stop: for existence-only queries, return at first match Output control: cap max returned mappings to avoid output explosion Runnable Example (Python) from typing import Dict, List, Set, Tuple class Graph: def __init__(self, n: int): self.n = n self.adj = [set() for _ in range(n)] self.label = [\u0026#34;\u0026#34;] * n def add_edge(self, u: int, v: int) -\u0026gt; None: self.adj[u].add(v) def build_candidates(G: Graph, Q: Graph) -\u0026gt; List[Set[int]]: cands: List[Set[int]] = [] for u in range(Q.n): s = set() for v in range(G.n): # Semantic + degree-lower-bound pruning if Q.label[u] == G.label[v] and len(Q.adj[u]) \u0026lt;= len(G.adj[v]): s.add(v) cands.append(s) return cands def refine_candidates(G: Graph, Q: Graph, cands: List[Set[int]]) -\u0026gt; None: # Ullmann-style neighborhood consistency refinement changed = True while changed: changed = False for u in range(Q.n): remove = [] for v in cands[u]: ok = True for nu in Q.adj[u]: # At least one candidate neighbor can realize edge u-\u0026gt;nu if not any((nv in G.adj[v]) for nv in cands[nu]): ok = False break if not ok: remove.append(v) if remove: changed = True for x in remove: cands[u].remove(x) def has_match_vf2_style(G: Graph, Q: Graph) -\u0026gt; bool: cands = build_candidates(G, Q) refine_candidates(G, Q, cands) if 
any(len(s) == 0 for s in cands): return False order = sorted(range(Q.n), key=lambda u: len(cands[u])) used_g: Set[int] = set() mapping: Dict[int, int] = {} def feasible(u: int, v: int) -\u0026gt; bool: # Edge consistency check against already matched nodes for qu, gv in mapping.items(): if u in Q.adj[qu] and v not in G.adj[gv]: return False if qu in Q.adj[u] and gv not in G.adj[v]: return False return True def dfs(i: int) -\u0026gt; bool: if i == len(order): return True u = order[i] for v in cands[u]: if v in used_g: continue if not feasible(u, v): continue mapping[u] = v used_g.add(v) if dfs(i + 1): return True # early stop: existence used_g.remove(v) del mapping[u] return False return dfs(0) if __name__ == \u0026#34;__main__\u0026#34;: # Data graph G = Graph(6) G.label = [\u0026#34;A\u0026#34;, \u0026#34;B\u0026#34;, \u0026#34;C\u0026#34;, \u0026#34;A\u0026#34;, \u0026#34;B\u0026#34;, \u0026#34;C\u0026#34;] G.add_edge(0, 1) G.add_edge(1, 2) G.add_edge(3, 4) G.add_edge(4, 5) # Pattern graph A-\u0026gt;B-\u0026gt;C Q = Graph(3) Q.label = [\u0026#34;A\u0026#34;, \u0026#34;B\u0026#34;, \u0026#34;C\u0026#34;] Q.add_edge(0, 1) Q.add_edge(1, 2) print(has_match_vf2_style(G, Q)) # True Run:\npython3 subgraph_match_demo.py E - Engineering (Engineering Applications) Scenario 1: Anti-Fraud Rule Graph Query (Python) Background: detect structured patterns like \u0026ldquo;shared device + multiple accounts + money returning flow\u0026rdquo;.\nWhy this fits: pattern size is small and constraints are strong, so pruning keeps queries controllable.\ndef is_suspicious(match_count: int, threshold: int = 1) -\u0026gt; bool: return match_count \u0026gt;= threshold print(is_suspicious(2, 1)) Scenario 2: Knowledge Graph Template Retrieval (Go) Background: retrieve patterns such as \u0026ldquo;author-paper-institution\u0026rdquo; or \u0026ldquo;drug-target-disease\u0026rdquo;.\nWhy this fits: strong label constraints allow candidate shrinking early.\npackage main import 
\u0026#34;fmt\u0026#34; func estimateSearchSpace(cands []int) int { space := 1 for _, x := range cands { space *= x } return space } func main() { fmt.Println(estimateSearchSpace([]int{3, 5, 2})) // 30 } Scenario 3: Template Routing Before Graph Sharding (JavaScript) Background: in multi-shard graph storage, quickly estimate which shards a pattern likely touches first.\nWhy this fits: candidate-shard pruning can reduce cross-shard RPC calls.\nfunction shardHint(candidateNodes, shardCount) { const hit = new Set(candidateNodes.map((x) =\u0026gt; x % shardCount)); return [...hit]; } console.log(shardHint([12, 18, 25, 31], 4)); R - Reflection (Reflection and Deep Dive) Complexity Analysis Worst-case subgraph isomorphism complexity is exponential (NP-hard) Real runtime is dominated by search-tree size Candidate pruning quality directly decides practical feasibility Alternatives and Trade-offs Approach Strength Limitation Brute-force enumeration Simple to implement Barely scalable Ullmann Strong preprocessing pruning, clear logic Matrix operations can be costly VF2 Widely adopted in engineering, efficient local checks Sensitive to candidate quality Native graph DB pattern engine Easier operations and integration More black-box behavior, tuning is experience-heavy Why \u0026ldquo;Candidate Pruning First\u0026rdquo; In production, most queries are constrained patterns (label + direction + attributes).\nThat means the bottleneck is usually the candidate stage, not \u0026ldquo;VF2 vs Ullmann\u0026rdquo; by itself.\nExplanation and Principles (Why This Works) Subgraph matching can be split into two layers:\nSemantic filtering: remove clearly impossible nodes first Structural validation: run isomorphism search in the reduced candidate space This layering often turns NP-hard matching into acceptable production latency for business queries.\nFrequently Asked Questions and Notes Is a smaller pattern always faster?\nNot necessarily. 
If constraints are weak (for example wildcard labels), even a small pattern can have huge candidate sets.\nCan I run VF2 without candidate filtering?\nYou can, but it is usually too slow on large graphs.\nWhat if result size explodes?\nYou must enforce max return limits and support existence-only mode.\nWhere should attribute predicates be applied?\nPush them as early as possible into candidate generation to reduce backtracking branches.\nBest Practices and Recommendations Match pattern nodes in ascending candidate-set size Push label/direction/degree/attribute filters up front Provide both limit and timeout in online APIs Separate metrics into candidate size, pruning rate, and backtracking depth for performance diagnosis S - Summary (Summary) Core Takeaways Subgraph Isomorphism is NP-hard in theory, but still practical in engineering contexts. VF2 and Ullmann both reduce to \u0026ldquo;constraint-driven search + pruning.\u0026rdquo; Constrained patterns are the mainstream query shape; performance hinges on candidate shrinking. Candidate pruning usually impacts throughput more than the specific classic algorithm name. Splitting query goals into existence / top-k / full enumeration significantly improves system stability. Recommended Further Reading Cordella et al. A (Sub)Graph Isomorphism Algorithm for Matching Large Graphs (VF2) Ullmann. 
An Algorithm for Subgraph Isomorphism Pattern-matching and query-optimization documentation from Neo4j / TigerGraph Closing Conclusion The real engineering skill in subgraph matching is not memorizing VF2 or Ullmann names; it is converting business constraints into strong pruning rules.\nWhen you compress the candidate space, even NP-hard matching can run inside production latency budgets.\nMetadata Reading time: 15-20 minutes Tags: Subgraph Matching, VF2, Ullmann, Graph Database, Pruning SEO keywords: Subgraph Isomorphism, VF2, Ullmann, candidate pruning Meta description: Practical subgraph matching engineering with VF2/Ullmann ideas and candidate-pruning-first strategy. Call To Action (CTA) A practical next step is to do two things now:\nMeasure \u0026ldquo;candidate size distribution\u0026rdquo; and \u0026ldquo;pruning rate\u0026rdquo; for existing pattern queries. Split out an existence-only query path and use early stop to reduce latency.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) # existence-only subgraph match skeleton def has_match(candidates, feasible): order = sorted(range(len(candidates)), key=lambda i: len(candidates[i])) used = set() map_q2g = {} def dfs(i): if i == len(order): return True u = order[i] for v in candidates[u]: if v in used: continue if not feasible(u, v, map_q2g): continue used.add(v) map_q2g[u] = v if dfs(i + 1): return True used.remove(v) del map_q2g[u] return False return dfs(0) #include \u0026lt;stdio.h\u0026gt; int main(void) { // C version: the key signal is prune first, backtrack later int candidate_size_q0 = 3; int candidate_size_q1 = 5; int search_space_upper = candidate_size_q0 * candidate_size_q1; printf(\u0026#34;upper search space = %d\\n\u0026#34;, search_space_upper); return 0; } #include \u0026lt;bits/stdc++.h\u0026gt; using namespace 
std; int main() { vector\u0026lt;int\u0026gt; cand = {3, 4, 2}; long long upper = 1; for (int x : cand) upper *= x; cout \u0026lt;\u0026lt; \u0026#34;upper=\u0026#34; \u0026lt;\u0026lt; upper \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } package main import \u0026#34;fmt\u0026#34; func upperBound(cands []int) int { ans := 1 for _, x := range cands { ans *= x } return ans } func main() { fmt.Println(upperBound([]int{3, 4, 2})) } fn upper_bound(cands: \u0026amp;[usize]) -\u0026gt; usize { cands.iter().product() } fn main() { let cands = vec![3, 4, 2]; println!(\u0026#34;{}\u0026#34;, upper_bound(\u0026amp;cands)); } function upperBound(cands) { return cands.reduce((acc, x) =\u0026gt; acc * x, 1); } console.log(upperBound([3, 4, 2])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/70-subgraph-matching-vf2-ullmann-and-pruning/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nSubgraph matching is one of the hardest parts of graph querying: NP-hard in theory, but not automatically \u0026ldquo;too slow\u0026rdquo; in production. 
Following the ACERS template, this article explains VF2 and Ullmann clearly, and focuses on what actually decides performance: \u003cstrong\u003ecandidate generation and pruning strategy\u003c/strong\u003e.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 15-20 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eSubgraph Matching\u003c/code\u003e, \u003ccode\u003eVF2\u003c/code\u003e, \u003ccode\u003eUllmann\u003c/code\u003e, \u003ccode\u003eGraph Database\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Subgraph Isomorphism, VF2, Ullmann, candidate pruning, graph pattern matching\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Starting from NP-hard subgraph isomorphism, this article explains VF2/Ullmann mechanics and practical pruning strategies for constrained graph-database pattern queries.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-audience\"\u003eTarget Audience\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eEngineers building pattern queries, rule detection, or risk-relationship mining in graph databases\u003c/li\u003e\n\u003cli\u003eDevelopers who already know BFS/DFS/connected components and want stronger graph-matching skills\u003c/li\u003e\n\u003cli\u003eAlgorithm practitioners balancing explainable rule matching against performance limits\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eIn graph databases, you regularly face requirements like:\u003c/p\u003e","title":"Subgraph Matching / Pattern Matching: VF2, Ullmann, and Engineering-Grade Pruning - ACERS Analysis"},{"content":" Subtitle / Abstract\nCentrality is not just a paper concept. In graph systems, it is a practical node-importance ranking engine. 
This article follows the ACERS structure to explain Degree / Betweenness / Closeness and gives one pragmatic conclusion: most online systems reliably support only Degree + approximate Betweenness.\nEstimated reading time: 12-16 minutes Tags: Graph Theory, Centrality, Degree, Betweenness, Closeness SEO keywords: graph centrality, Degree Centrality, Betweenness, Closeness, approximate Betweenness Meta description: Engineering guide to graph centrality: definitions, complexity, approximation methods, and production strategies, with runnable code. Target Audience Engineers working on relationship graph analysis, knowledge graphs, or graph-database query optimization Developers who need to turn \u0026ldquo;node importance\u0026rdquo; from concept into production metric Practitioners who want to understand why Betweenness is expensive in production and how to approximate it Background / Motivation In graph systems, you will eventually face questions like these:\nWhich nodes are social influencers or transaction hubs? Which nodes are key bridges whose removal significantly fragments the graph? Which nodes are globally closer to others and better suited as entry points or cache hotspots? 
These map directly to centrality metrics:\nDegree Centrality: how many connections a node has (local importance) Betweenness Centrality: whether a node lies on many shortest paths (bridge importance) Closeness Centrality: whether a node has shorter average distance to the full graph (global proximity) In practice, the biggest challenge is not definition - it is compute cost:\nDegree is very cheap and almost always supports real-time use Exact Betweenness is expensive and is usually offline or approximate Closeness requires many shortest-path computations and quickly becomes hard to run online on large graphs Core Concepts 1) Degree Centrality For node v in an undirected graph, Degree centrality is commonly written as:\nC_D(v) = deg(v) / (n - 1) Meaning: local connectivity activity of the node.\n2) Betweenness Centrality C_B(v) = Σ_{s≠v≠t} (σ_st(v) / σ_st) σ_st: number of shortest paths from s to t σ_st(v): number of those shortest paths that pass through v Meaning: mediation power of the node as a channel or bridge.\n3) Closeness Centrality C_C(v) = (n - 1) / Σ_{u≠v} d(v, u) Meaning: how close the node is, overall, to all other nodes in the graph.\nPractical note: disconnected graphs often use harmonic closeness to avoid denominator issues caused by unreachable nodes.\nA — Algorithm (Problem and Algorithm) Problem Restatement (Engineering Version) Given graph G=(V,E), compute centrality scores for each node and return Top-K nodes:\nDegree centrality Betweenness centrality (approximation allowed) Closeness centrality (or harmonic variant) Input/Output Name Type Description graph adjacency list graph[u] = [v1, v2, ...] 
(unweighted) k int number of Top-K outputs mode str degree / betweenness / closeness return List[(node, score)] sorted node scores Example 1 (Small Graph) A-B-C-D and B-E Intuition: - B has high degree -\u0026gt; high Degree - B/C lie on many shortest paths -\u0026gt; high Betweenness - B/C are closer on average to other nodes -\u0026gt; higher Closeness Example 2 (Bridge Node) Two clusters are connected through X X usually has very high Betweenness, even if its Degree is not the highest Reasoning Path (From Naive to Production) Naive Approach Compute shortest paths for every pair of nodes, then count how often each node is traversed Complexity is too high for large graphs Key Observations Degree uses only local adjacency and is near linear complexity Betweenness can be significantly optimized with Brandes, but remains relatively expensive Closeness fundamentally needs many-source shortest paths, so cost rises rapidly with graph size Engineering Decisions Online: prioritize Degree, add sampled approximate Betweenness when necessary Offline batch: compute fuller Betweenness / Closeness Large graphs: use Top-K + sampling + layered cache together C — Concepts (Core Ideas) Method Categories Degree: local statistics Betweenness: dependency accumulation over global shortest paths Closeness: global distance aggregation Complexity Intuition (Unweighted Graph) Metric Common Algorithm Rough Complexity Degree traverse adjacency list O(V+E) Betweenness Brandes O(VE) Closeness run BFS from every node O(V(V+E)) Practical Conclusion (Key Point) Most online systems reliably support only Degree + approximate Betweenness.\nCloseness is often moved offline or computed only on small subgraphs.\nThe reasons are straightforward:\nDegree is low-cost, highly interpretable, and easy to update incrementally Exact Betweenness is too expensive, while approximation is controllable Closeness is sensitive to connectivity and graph size, making online SLA hard to guarantee Practical Guide / Steps 
Step 1: Define the Business Question First Looking for highly connected nodes: Degree Looking for critical bridges: Betweenness Looking for global proximity centers: Closeness Step 2: Choose Online or Offline Online services: Degree + approximate Betweenness Offline reporting: add Closeness / refined Betweenness Step 3: Runnable Python Baseline from collections import deque import random def degree_centrality(graph): n = max(len(graph), 1) return {u: len(graph.get(u, [])) / max(n - 1, 1) for u in graph} def bfs_dist(graph, s): dist = {s: 0} q = deque([s]) while q: u = q.popleft() for v in graph.get(u, []): if v not in dist: dist[v] = dist[u] + 1 q.append(v) return dist def closeness_centrality(graph): n = len(graph) cc = {} for u in graph: d = bfs_dist(graph, u) if len(d) \u0026lt;= 1: cc[u] = 0.0 continue s = sum(d.values()) cc[u] = (len(d) - 1) / s if s \u0026gt; 0 else 0.0 # Can be switched to harmonic closeness depending on the use case return cc def approx_betweenness_by_sampling(graph, samples=8, seed=0): random.seed(seed) nodes = list(graph.keys()) if not nodes: return {} score = {u: 0.0 for u in nodes} sample_sources = random.sample(nodes, min(samples, len(nodes))) for s in sample_sources: # Single-source shortest-path DAG + dependency back-propagation (Brandes style) stack = [] pred = {v: [] for v in nodes} sigma = {v: 0.0 for v in nodes} dist = {v: -1 for v in nodes} sigma[s] = 1.0 dist[s] = 0 q = deque([s]) while q: v = q.popleft() stack.append(v) for w in graph.get(v, []): if dist[w] \u0026lt; 0: dist[w] = dist[v] + 1 q.append(w) if dist[w] == dist[v] + 1: sigma[w] += sigma[v] pred[w].append(v) delta = {v: 0.0 for v in nodes} while stack: w = stack.pop() for v in pred[w]: if sigma[w] \u0026gt; 0: delta[v] += (sigma[v] / sigma[w]) * (1.0 + delta[w]) if w != s: score[w] += delta[w] # Sampling normalization (approximation) factor = len(nodes) / max(len(sample_sources), 1) return {u: score[u] * factor for u in nodes} if __name__ == 
\u0026#34;__main__\u0026#34;: g = { \u0026#34;A\u0026#34;: [\u0026#34;B\u0026#34;], \u0026#34;B\u0026#34;: [\u0026#34;A\u0026#34;, \u0026#34;C\u0026#34;, \u0026#34;E\u0026#34;], \u0026#34;C\u0026#34;: [\u0026#34;B\u0026#34;, \u0026#34;D\u0026#34;], \u0026#34;D\u0026#34;: [\u0026#34;C\u0026#34;], \u0026#34;E\u0026#34;: [\u0026#34;B\u0026#34;], } print(\u0026#34;degree\u0026#34;, degree_centrality(g)) print(\u0026#34;closeness\u0026#34;, closeness_centrality(g)) print(\u0026#34;approx_betweenness\u0026#34;, approx_betweenness_by_sampling(g, samples=3, seed=42)) E — Engineering (Engineering Applications) Scenario 1: Anti-Fraud Hub Account Detection (Degree) Background: in money-transfer graphs, highly connected accounts are often transfer hubs.\nWhy this fits: Degree is fast to compute and suitable as an online risk-control feature.\n# online feature: out-degree / in-degree risk_score = out_degree * 0.6 + in_degree * 0.4 Scenario 2: Critical Bridge Node Alerts (Approximate Betweenness) Background: in social or transaction graphs, some nodes are the only channel between communities.\nWhy this fits: Betweenness finds bridges, but exact computation is expensive; sampled approximation is easier to deploy.\n// pseudo-go style: run sampled Brandes in batch job // 1) sample K sources // 2) accumulate dependency scores // 3) write top-k bridge nodes to Redis/OLAP Scenario 3: Entry Selection for Path Explanation (Closeness) Background: explanation systems may prefer to start path rendering from nodes that are globally closer to core regions.\nWhy this fits: Closeness captures nodes with shorter average distances.\n// Use top-N offline closeness nodes as explanation entry candidates const candidates = centralityRank.slice(0, N); R — Reflection (Reflection and Deeper Analysis) Exact vs Approximate Metric Exact Cost Approximation Strategy Engineering Recommendation Degree Low Not required Compute online directly Betweenness High Source-node sampling, Top-K estimation Read offline 
results online / batch refresh Closeness Medium-high Subgraph computation, harmonic variant Mostly used for offline analysis Common Wrong Approaches Treating Betweenness as a fully real-time online metric Applying standard Closeness directly to large disconnected graphs without variant handling Ignoring directed vs undirected differences, causing interpretation errors Why \u0026ldquo;Degree + Approximate Betweenness\u0026rdquo; Is Most Common Controllable cost: can satisfy online SLA Strong interpretability: easy for product and business teams to reason about Easy evolution path: launch a usable version first, then add refined offline metrics Explanation and Principles (Why This Works) The engineering essence of centrality is extracting stable, interpretable importance signals at acceptable cost.\nDegree gives local activity Betweenness gives bridge-control power Closeness gives global proximity In real systems, the question is not \u0026ldquo;which metric is theoretically best,\u0026rdquo; but \u0026ldquo;which metric is sustainable under current scale and latency budget.\u0026rdquo;\nFrequently Asked Questions and Notes Can directed and undirected graphs use the same formulas?\nThe high-level idea is shared, but counting conventions differ (in-degree/out-degree, shortest-path direction).\nDoes Betweenness have to be exact?\nNot necessarily. 
In many cases, approximate ranking is enough, especially for Top-K.\nHow should Closeness be handled on disconnected graphs?\nHarmonic closeness is recommended, or restrict computation to connected subgraphs.\nDo centrality scores need real-time updates?\nMost systems use \u0026ldquo;offline batch refresh + online cache.\u0026rdquo; Only Degree is usually feasible for lightweight real-time increment.\nBest Practices and Recommendations Split centrality into two layers: offline primary computation + online feature service Start from business questions, then pick metrics; avoid a \u0026ldquo;metric-first\u0026rdquo; mindset Put budgets on Betweenness: sample size, time window, Top-K-only outputs For large graphs, split by connected components first to avoid indiscriminate full-graph computation S — Summary (Summary) Core Takeaways Degree, Betweenness, and Closeness capture local connectivity, bridge mediation, and global proximity Betweenness is expensive in production; exact full computation is usually unsuitable online The pragmatic combination in most systems is Degree + approximate Betweenness Closeness is better for offline analysis or small subgraphs Metric choice must obey constraints of scale, latency, and interpretability Recommended Further Reading Ulrik Brandes (2001): A Faster Algorithm for Betweenness Centrality NetworkX centrality documentation (quick experiments) GDS centrality operator design in graph databases (offline batch practice) Metadata Reading time: 12-16 minutes Tags: Graph Theory, Centrality, Degree, Betweenness, Closeness SEO keywords: graph centrality, Degree Centrality, Betweenness, Closeness, approximate Betweenness Meta description: Engineering guide to the graph centrality trio: definitions, complexity, approximation, and rollout strategy, with a focus on why most systems only support Degree and approximate Betweenness. 
Call To Action (CTA) A practical next step is to do two things:\nLaunch Degree + Top-K first to validate business interpretability Add an offline sampled-Betweenness job and compare ranking stability If you want, I can write the next engineering continuation on \u0026ldquo;PageRank + Community Detection (Louvain).\u0026rdquo;\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) # Degree centrality (unweighted graph) def degree_centrality(graph): n = max(len(graph), 1) return {u: len(graph.get(u, [])) / max(n - 1, 1) for u in graph} /* degree centrality for adjacency matrix (undirected) */ #include \u0026lt;stdio.h\u0026gt; #define N 5 int main(void) { int g[N][N] = { {0,1,0,0,1}, {1,0,1,0,0}, {0,1,0,1,0}, {0,0,1,0,0}, {1,0,0,0,0} }; for (int i = 0; i \u0026lt; N; i++) { int deg = 0; for (int j = 0; j \u0026lt; N; j++) deg += g[i][j]; double cd = (double)deg / (N - 1); printf(\u0026#34;node %d degree_c=%.3f\\n\u0026#34;, i, cd); } return 0; } #include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; int main() { vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt; g = { {1,4}, {0,2}, {1,3}, {2}, {0} }; int n = (int)g.size(); for (int u = 0; u \u0026lt; n; ++u) { double c = (double)g[u].size() / max(n - 1, 1); cout \u0026lt;\u0026lt; \u0026#34;node \u0026#34; \u0026lt;\u0026lt; u \u0026lt;\u0026lt; \u0026#34; degree_c=\u0026#34; \u0026lt;\u0026lt; c \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } } package main import \u0026#34;fmt\u0026#34; func main() { g := map[int][]int{0: {1,4}, 1: {0,2}, 2: {1,3}, 3: {2}, 4: {0}} n := len(g) for u, nbrs := range g { cd := float64(len(nbrs)) / float64(n-1) fmt.Printf(\u0026#34;node %d degree_c=%.3f\\n\u0026#34;, u, cd) } } use std::collections::HashMap; fn main() { let mut g: HashMap\u0026lt;i32, Vec\u0026lt;i32\u0026gt;\u0026gt; = HashMap::new(); g.insert(0, vec![1, 4]); g.insert(1, vec![0, 2]); g.insert(2, vec![1, 3]); g.insert(3, vec![2]); g.insert(4, vec![0]); let n = g.len() as f64; for (u, nbrs) 
in \u0026amp;g { let cd = nbrs.len() as f64 / (n - 1.0); println!(\u0026#34;node {} degree_c={:.3}\u0026#34;, u, cd); } } const g = new Map([ [\u0026#34;A\u0026#34;, [\u0026#34;B\u0026#34;, \u0026#34;E\u0026#34;]], [\u0026#34;B\u0026#34;, [\u0026#34;A\u0026#34;, \u0026#34;C\u0026#34;]], [\u0026#34;C\u0026#34;, [\u0026#34;B\u0026#34;, \u0026#34;D\u0026#34;]], [\u0026#34;D\u0026#34;, [\u0026#34;C\u0026#34;]], [\u0026#34;E\u0026#34;, [\u0026#34;A\u0026#34;]], ]); const n = g.size; for (const [u, nbrs] of g.entries()) { const cd = nbrs.length / (n - 1); console.log(u, \u0026#34;degree_c=\u0026#34;, cd.toFixed(3)); } ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/50-graph-centrality-degree-betweenness-closeness/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nCentrality is not just a paper concept. In graph systems, it is a practical node-importance ranking engine. This article follows the ACERS structure to explain \u003cstrong\u003eDegree / Betweenness / Closeness\u003c/strong\u003e and gives one pragmatic conclusion: \u003cstrong\u003emost online systems reliably support only Degree + approximate Betweenness\u003c/strong\u003e.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 12-16 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eGraph Theory\u003c/code\u003e, \u003ccode\u003eCentrality\u003c/code\u003e, \u003ccode\u003eDegree\u003c/code\u003e, \u003ccode\u003eBetweenness\u003c/code\u003e, \u003ccode\u003eCloseness\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: graph centrality, Degree Centrality, Betweenness, Closeness, approximate Betweenness\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Engineering guide to graph centrality: definitions, 
complexity, approximation methods, and production strategies, with runnable code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-audience\"\u003eTarget Audience\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eEngineers working on relationship graph analysis, knowledge graphs, or graph-database query optimization\u003c/li\u003e\n\u003cli\u003eDevelopers who need to turn \u0026ldquo;node importance\u0026rdquo; from concept into production metric\u003c/li\u003e\n\u003cli\u003ePractitioners who want to understand why Betweenness is expensive in production and how to approximate it\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eIn graph systems, you will eventually face questions like these:\u003c/p\u003e","title":"The Graph Centrality Trio: Degree, Betweenness, and Closeness - ACERS Engineering Analysis"},{"content":" Subtitle / Abstract\nConnectivity tells you how a graph is partitioned, while PageRank tells you who matters inside each component. This is one of the core advantages of graph databases over relational databases: not only linking data, but propagating structural importance. This article follows ACERS to explain PageRank / PPR principles and production implementation.\nEstimated reading time: 15-20 minutes Tags: PageRank, PPR, Graph Database, Sparse Matrix SEO keywords: PageRank, Personalized PageRank, sparse matrix, incremental updates, graph database Meta description: From classic PageRank to Personalized PageRank, covering iterative computation, sparse-matrix optimization, and incremental update strategy, with runnable multi-language implementations. 
Target Audience Engineers building ranking, recommendation, or influence analysis on graph databases Developers who already know BFS/DFS/connected components and want to level up to graph scoring Algorithm engineers focused on iteration performance and update latency on large online graphs Background / Motivation You may already have split graphs into connected components and SCCs, but production systems still face a harder question:\nInside the same component, who is more critical? Given a user or seed node, who is structurally more relevant? This is exactly what PageRank / Personalized PageRank (PPR) is for.\nThis is also a key difference between graph databases and relational databases:\nRelational databases are strong at joins and filtering (row/column view) Graph databases are strong at topological propagation (edge-structure view) At its core, PageRank is probability-mass propagation over a graph, combining local edges and global structure into a rankable score.\nCore Concepts PageRank: global importance score tied to inbound-link quality, not just inbound-link count Personalized PageRank (PPR): biases random walks toward a seed set to obtain personalized importance Damping factor d / alpha: controls whether the walk continues along edges or jumps back to random/seed distribution Sparse matrix: adjacency matrices are extremely sparse at scale; multiplication must use CSR/CSC or adjacency lists Incremental updates: when edges/nodes change, prefer local correction over full recomputation A — Algorithm (Problem and Algorithm) Problem Restatement (Engineering) Given a directed graph G=(V,E), compute node importance scores:\nPageRank: global importance over the full graph PPR: personalized importance relative to seed distribution s Input/Output Name Type Description n int number of nodes edges List[(u,v)] directed edges u -\u0026gt; v d / alpha float damping factor, usually around 0.85 s vector PPR seed distribution (sums to 1) return vector rank score per node 
Example 1 (PageRank) n = 4 edges = [(0,1),(1,2),(2,0),(2,3)] Output: rank[0..3] Characteristic: 0/1/2 form a cycle, node 3 has only incoming links; score is driven by structure, not simple indegree Example 2 (PPR) Same graph, seed node set to 2 (s[2]=1) Output: ppr[0..3] Characteristic: nodes with stronger reachability from node 2 get higher scores Reasoning Path (From Naive to Usable) Naive Idea 1: Rank by Indegree Problems:\nOnly counts how many nodes point to you, not who they are Many incoming edges from low-quality nodes can mislead the ranking Naive Idea 2: Fixed-Depth Random-Walk Sampling Problems:\nHigh sampling variance and weak stability Hard to provide controllable error guarantees for online services Key Observations Importance should come from votes by high-quality nodes Voting is an iterative propagation process and can be written as linear iteration Graphs are sparse; core cost is sparse multiplication and number of iterations to convergence Method Selection PageRank: global baseline scoring PPR: user/query-seed personalized scoring Engineering focus: iterative computation + sparse storage + incremental updates C — Concepts (Core Ideas) PageRank Formula Let PR_t(u) be the score of node u at iteration t, and Out(v) be outdegree of v:\n[ PR_{t+1}(u)=\\frac{1-d}{N}+d\\sum_{v\\to u}\\frac{PR_t(v)}{Out(v)} ]\nMeaning:\nWith probability 1-d, jump randomly (prevents getting trapped in closed loops) With probability d, propagate importance along edges PPR Formula Given seed distribution s (for example, normalized distribution of nodes clicked by a user):\n[ \\pi_{t+1}=(1-\\alpha)s+\\alpha P^T\\pi_t ]\nMeaning:\nEach iteration returns to seed distribution, so the result has personalized bias When s is uniform, PPR degenerates toward standard PageRank Convergence Criterion Commonly use L1 difference:\n[ |r_{t+1}-r_t|_1\u0026lt;\\varepsilon ]\nIn production, eps is often 1e-6 ~ 1e-8, with max_iter set to avoid long-tail iteration on extreme graphs.\nPractical 
Guide / Steps Build the graph using adjacency lists or CSR, avoid dense matrices Handle dangling nodes (outdegree = 0) Iteratively update the rank vector Compute per-iteration error and check convergence For large online graphs, prefer warm start (use previous rank as initialization) For local graph changes, use incremental updates instead of full recomputation Runnable Example (Python) from typing import List, Tuple def pagerank(n: int, edges: List[Tuple[int, int]], d: float = 0.85, eps: float = 1e-8, max_iter: int = 100): out = [[] for _ in range(n)] for u, v in edges: out[u].append(v) rank = [1.0 / n] * n for _ in range(max_iter): new_rank = [(1.0 - d) / n for _ in range(n)] dangling_mass = 0.0 for u in range(n): if len(out[u]) == 0: dangling_mass += rank[u] else: share = rank[u] / len(out[u]) for v in out[u]: new_rank[v] += d * share # Redistribute dangling mass uniformly add_back = d * dangling_mass / n for i in range(n): new_rank[i] += add_back diff = sum(abs(new_rank[i] - rank[i]) for i in range(n)) rank = new_rank if diff \u0026lt; eps: break return rank def personalized_pagerank( n: int, edges: List[Tuple[int, int]], seed: List[float], alpha: float = 0.85, eps: float = 1e-8, max_iter: int = 100, ): out = [[] for _ in range(n)] for u, v in edges: out[u].append(v) pi = seed[:] # warm start with seed for _ in range(max_iter): new_pi = [(1.0 - alpha) * seed[i] for i in range(n)] dangling_mass = 0.0 for u in range(n): if len(out[u]) == 0: dangling_mass += pi[u] else: share = pi[u] / len(out[u]) for v in out[u]: new_pi[v] += alpha * share # Inject dangling mass back into seed distribution (more faithful to PPR semantics) for i in range(n): new_pi[i] += alpha * dangling_mass * seed[i] diff = sum(abs(new_pi[i] - pi[i]) for i in range(n)) pi = new_pi if diff \u0026lt; eps: break return pi if __name__ == \u0026#34;__main__\u0026#34;: n = 5 edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)] pr = pagerank(n, edges) print(\u0026#34;PR:\u0026#34;, [round(x, 6) for 
x in pr]) seed = [0.0] * n seed[2] = 1.0 ppr = personalized_pagerank(n, edges, seed) print(\u0026#34;PPR(seed=2):\u0026#34;, [round(x, 6) for x in ppr]) Run:\npython3 pagerank_demo.py E — Engineering (Engineering Applications) Scenario 1: Candidate Re-ranking in Recommendation Systems (Python) Background: recall returns 1k candidates; they need graph-structure-aware re-ranking.\nWhy this fits: PPR amplifies graph neighborhoods that are more relevant to the current user.\ndef rerank_by_score(candidates, score): return sorted(candidates, key=lambda x: score.get(x, 0.0), reverse=True) print(rerank_by_score([3, 1, 2], {1: 0.12, 2: 0.35, 3: 0.2})) Scenario 2: Influence Analysis (Go) Background: estimate node influence in social or knowledge propagation graphs.\nWhy this fits: PageRank captures the cascading value of being referenced by important nodes.\npackage main import \u0026#34;fmt\u0026#34; func topK(nodes []int, score map[int]float64, k int) []int { for i := 0; i \u0026lt; len(nodes); i++ { for j := i + 1; j \u0026lt; len(nodes); j++ { if score[nodes[j]] \u0026gt; score[nodes[i]] { nodes[i], nodes[j] = nodes[j], nodes[i] } } } if k \u0026gt; len(nodes) { k = len(nodes) } return nodes[:k] } func main() { nodes := []int{1, 2, 3, 4} score := map[int]float64{1: 0.08, 2: 0.31, 3: 0.12, 4: 0.22} fmt.Println(topK(nodes, score, 2)) } Scenario 3: Incremental Update Pipeline (JavaScript) Background: edges are added/removed daily, so full recomputation every time is too expensive.\nWhy this fits: warm start from old rank plus local updates can significantly cut latency.\nfunction warmStartUpdate(prevRank, deltaEdgesCount) { const factor = Math.max(0.9, 1 - deltaEdgesCount * 0.001); return prevRank.map((x) =\u0026gt; x * factor); } console.log(warmStartUpdate([0.2, 0.3, 0.5], 12)); R — Reflection (Reflection and Deeper Analysis) Complexity Analysis Single-iteration complexity: O(E) Total complexity: O(T * E) (T is iteration count) Space complexity: O(V + E) (adjacency list + 
rank vectors) Alternatives and Tradeoffs Method Advantages Limitations Indegree ranking Fast to compute Ignores source quality, noisy PageRank Globally stable, interpretable No personalization bias PPR Strong personalization Need per-seed computation, expensive at scale Sampled random walk Parallelizable, flexible approximation Harder variance and stability control Why This Plan Is Most Practical for Engineering Iterative model is simple and easy to batch and monitor Sparse matrix / adjacency lists naturally fit large graphs Warm start and incremental updates support online latency constraints Explanation and Principles (Why This Works) PageRank turns graph structure into a probability-flow conservation problem:\nEach node distributes its current score along outgoing edges Target nodes absorb quality from upstream nodes Damping guarantees traversability and convergence PPR adds a \u0026ldquo;return-to-seed\u0026rdquo; bias to this framework, binding ranking results to user/query context.\nFrequently Asked Questions and Notes Why can convergence be slow?\nalpha may be too high, graph diameter may be large, or dangling nodes may be many; lower alpha, improve preprocessing, and use warm start.\nHow should dangling nodes be handled?\nCommon choices are uniform redistribution of dangling mass, or reinjection into seed distribution for PPR.\nIs online PPR too expensive?\nIt typically needs caching, batched seeds, approximate indexes, or offline precomputation.\nWhen do incremental updates stop working well?\nIf graph structure is heavily reshuffled (large-scale edge rewrites), local correction error accumulates and periodic full recomputation is needed.\nBest Practices and Recommendations Use sparse storage (CSR/CSC or adjacency lists) as default Monitor iteration with residual, max-iteration hit rate, and top-k stability together In online systems, prefer warm start first, then gate incremental vs full by change-size threshold Use PageRank as coarse ranking feature and 
PPR as personalized weighting feature S — Summary (Summary) Core Takeaways Connected components answer \u0026ldquo;how to split,\u0026rdquo; while PageRank/PPR answer \u0026ldquo;who matters inside each part.\u0026rdquo; PageRank is global structural scoring; PPR is seed-oriented personalized scoring. Production rollout requires all three together: iterative computation, sparse implementation, and incremental update mechanism. For topological propagation tasks, graph databases naturally outperform pure relational join-centric views. To make systems usable online, govern convergence error, compute cost, and update frequency together. Recommended Further Reading Brin, Page. The Anatomy of a Large-Scale Hypertextual Web Search Engine Andersen et al. Local graph partitioning using PageRank vectors Neo4j GDS docs: PageRank / Personalized PageRank Closing / Conclusion PageRank/PPR is not an \u0026ldquo;old algorithm\u0026rdquo; - it is a foundational capability layer in graph computing systems.\nOnly when it is combined with connected components, SCC, and partitioning strategy do you get a complete graph-database engineering loop.\nMetadata Reading time: 15-20 minutes Tags: PageRank, PPR, Graph Database, Recommendation Systems, Incremental Updates SEO keywords: PageRank, Personalized PageRank, sparse matrix, incremental updates Meta description: From PR to PPR, a systematic guide to graph importance propagation and engineering optimization (iteration, sparsity, incrementality). Call To Action (CTA) A practical next step is to do two things:\nRun one full PageRank over your business graph and record top-k stability. Run one incremental update experiment on daily edge changes and compare full vs incremental error/latency. 
If you want, I can continue with an engineering comparison of HITS / SALSA versus PageRank.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) def pagerank(n, edges, d=0.85, iters=50): out = [[] for _ in range(n)] for u, v in edges: out[u].append(v) r = [1.0 / n] * n for _ in range(iters): nr = [(1 - d) / n] * n dangling = 0.0 for u in range(n): if not out[u]: dangling += r[u] continue share = r[u] / len(out[u]) for v in out[u]: nr[v] += d * share add = d * dangling / n for i in range(n): nr[i] += add r = nr return r #include \u0026lt;stdio.h\u0026gt; void pagerank_demo() { // Minimal demo: production should use CSR/CSC storage double rank[3] = {1.0/3, 1.0/3, 1.0/3}; for (int t = 0; t \u0026lt; 5; ++t) { // Detail omitted: this only demonstrates the iteration framework printf(\u0026#34;iter %d: %.6f %.6f %.6f\\n\u0026#34;, t, rank[0], rank[1], rank[2]); } } int main() { pagerank_demo(); return 0; } #include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; vector\u0026lt;double\u0026gt; pagerank(int n, const vector\u0026lt;pair\u0026lt;int,int\u0026gt;\u0026gt;\u0026amp; edges, double d=0.85, int iters=50) { vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt; out(n); for (auto [u,v] : edges) out[u].push_back(v); vector\u0026lt;double\u0026gt; r(n, 1.0 / n); for (int t = 0; t \u0026lt; iters; ++t) { vector\u0026lt;double\u0026gt; nr(n, (1 - d) / n); double dangling = 0.0; for (int u = 0; u \u0026lt; n; ++u) { if (out[u].empty()) { dangling += r[u]; } else { double share = r[u] / out[u].size(); for (int v : out[u]) nr[v] += d * share; } } double add = d * dangling / n; for (int i = 0; i \u0026lt; n; ++i) nr[i] += add; r.swap(nr); } return r; } int main() { vector\u0026lt;pair\u0026lt;int,int\u0026gt;\u0026gt; edges{{0,1},{1,2},{2,0},{2,3}}; auto r = pagerank(4, edges); for (double x : r) cout \u0026lt;\u0026lt; fixed \u0026lt;\u0026lt; setprecision(6) \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; cout \u0026lt;\u0026lt; 
\u0026#34;\\n\u0026#34;; } package main import \u0026#34;fmt\u0026#34; func pagerank(n int, edges [][2]int, d float64, iters int) []float64 { out := make([][]int, n) for _, e := range edges { u, v := e[0], e[1] out[u] = append(out[u], v) } r := make([]float64, n) for i := range r { r[i] = 1.0 / float64(n) } for t := 0; t \u0026lt; iters; t++ { nr := make([]float64, n) for i := range nr { nr[i] = (1.0 - d) / float64(n) } dangling := 0.0 for u := 0; u \u0026lt; n; u++ { if len(out[u]) == 0 { dangling += r[u] continue } share := r[u] / float64(len(out[u])) for _, v := range out[u] { nr[v] += d * share } } add := d * dangling / float64(n) for i := range nr { nr[i] += add } r = nr } return r } func main() { edges := [][2]int{{0, 1}, {1, 2}, {2, 0}, {2, 3}} fmt.Println(pagerank(4, edges, 0.85, 50)) } fn pagerank(n: usize, edges: \u0026amp;[(usize, usize)], d: f64, iters: usize) -\u0026gt; Vec\u0026lt;f64\u0026gt; { let mut out = vec![Vec::\u0026lt;usize\u0026gt;::new(); n]; for \u0026amp;(u, v) in edges { out[u].push(v); } let mut r = vec![1.0 / n as f64; n]; for _ in 0..iters { let mut nr = vec![(1.0 - d) / n as f64; n]; let mut dangling = 0.0; for u in 0..n { if out[u].is_empty() { dangling += r[u]; } else { let share = r[u] / out[u].len() as f64; for \u0026amp;v in \u0026amp;out[u] { nr[v] += d * share; } } } let add = d * dangling / n as f64; for x in \u0026amp;mut nr { *x += add; } r = nr; } r } fn main() { let edges = vec![(0, 1), (1, 2), (2, 0), (2, 3)]; let r = pagerank(4, \u0026amp;edges, 0.85, 50); println!(\u0026#34;{:?}\u0026#34;, r); } function pagerank(n, edges, d = 0.85, iters = 50) { const out = Array.from({ length: n }, () =\u0026gt; []); for (const [u, v] of edges) out[u].push(v); let rank = Array(n).fill(1 / n); for (let t = 0; t \u0026lt; iters; t += 1) { const next = Array(n).fill((1 - d) / n); let dangling = 0; for (let u = 0; u \u0026lt; n; u += 1) { if (out[u].length === 0) { dangling += rank[u]; } else { const share = rank[u] / out[u].length; for 
(const v of out[u]) next[v] += d * share; } } const add = (d * dangling) / n; for (let i = 0; i \u0026lt; n; i += 1) next[i] += add; rank = next; } return rank; } console.log(pagerank(4, [[0, 1], [1, 2], [2, 0], [2, 3]])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/60-pagerank-and-personalized-pagerank/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nConnectivity tells you how a graph is partitioned, while PageRank tells you who matters inside each component. This is one of the core advantages of graph databases over relational databases: not only linking data, but propagating structural importance. This article follows ACERS to explain PageRank / PPR principles and production implementation.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 15-20 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003ePageRank\u003c/code\u003e, \u003ccode\u003ePPR\u003c/code\u003e, \u003ccode\u003eGraph Database\u003c/code\u003e, \u003ccode\u003eSparse Matrix\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: PageRank, Personalized PageRank, sparse matrix, incremental updates, graph database\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: From classic PageRank to Personalized PageRank, covering iterative computation, sparse-matrix optimization, and incremental update strategy, with runnable multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-audience\"\u003eTarget Audience\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eEngineers building ranking, recommendation, or influence analysis on graph databases\u003c/li\u003e\n\u003cli\u003eDevelopers who already know BFS/DFS/connected components and want to level up to graph 
scoring\u003c/li\u003e\n\u003cli\u003eAlgorithm engineers focused on iteration performance and update latency on large online graphs\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eYou may already have split graphs into connected components and SCCs, but production systems still face a harder question:\u003c/p\u003e","title":"PageRank / Personalized PageRank: Node Importance and Incremental Updates in Graph Databases - ACERS Analysis"},{"content":" Subtitle / Abstract\nThe hard part of graph querying is not \u0026ldquo;whether you can find a path\u0026rdquo;. It is whether you can find it reliably within latency and memory budgets. This article breaks reachability into three layers: online BFS + hop limit, offline closure (usually not fully materialized), and indexed queries (2-hop / reach index), then gives a directly usable engineering decision template.\nEstimated reading time: 12-16 minutes Tags: k-hop, Reachability, BFS, Bitmap Index SEO keywords: k-hop, Reachability, Transitive Closure, 2-hop labeling, reach index Meta description: From online BFS to indexed reachability queries: hop limits, closure cost tradeoffs, and 2-hop/bitmap index selection. 
Target Audience Engineers working on graph databases, risk-control graphs, dependency analysis, or call-chain troubleshooting Developers who need to turn \u0026ldquo;path existence\u0026rdquo; from interview logic into production capability System designers facing the three-way tension of high query volume, large graphs, and frequent updates Background / Motivation Reachability queries are a core graph-system capability, but production systems face three practical tensions:\nQueries must be fast: typically synchronous inside API paths (millisecond-level) Graphs are large: from millions to hundreds of millions of nodes/edges Updates are frequent: index maintenance cost cannot grow without bound So you should not focus on one algorithm only. You need layered strategies by scenario:\nOnline low-latency: BFS + hop limit + early stop Offline exactness: transitive closure (usually not fully materialized) Query-heavy workloads: bitmap indexes, 2-hop labeling, reach indexes Core Concepts Concept Definition Key Cost Reachability Whether a path u -\u0026gt; v exists Query latency k-hop query Reachable set with path length \u0026lt;= k Frontier expansion size Transitive Closure Full pairwise reachability matrix Precompute and storage cost 2-hop Labeling Reachability decision via hub labels Label build and maintenance complexity Reach Index Family of query-oriented reachability indexes Index size and update cost A — Algorithm (Problem and Algorithm) Problem Restatement (Engineering Abstraction) Given a directed graph G=(V,E), support two query types:\nreachable(u, v): decide whether u can reach v k_hop(u, k): return nodes reachable from u within k hops Constraints:\nQueries must support early stop (target hit, hop limit reached, budget reached) No recursion (deep-graph risk); use iterative implementations Optional: introduce indexes to accelerate high-frequency queries Input / Output Name Type Description graph List[List[int]] Adjacency list, node IDs are 0..n-1 u, v int 
Source and target k int Max hops Return 1 bool Reachable or not Return 2 Set[int] k-hop neighborhood Example 1: k-hop graph = [ [1,2], # 0 [3], # 1 [3,4], # 2 [5], # 3 [], # 4 [] # 5 ] query: k_hop(0, 2) result: {0,1,2,3,4} Example 2: Reachability query: reachable(0, 5) result: true query: reachable(4, 5) result: false Reasoning Path (From Naive to Engineering-Ready) Naive Option 1: Full-graph BFS for Every Query Correct but not economical Too much repeated computation under high query frequency Naive Option 2: Full Transitive Closure (TC) Query can be O(1) But build and storage are usually too heavy (especially large graph + frequent updates) Key Observations Most online queries need only local scope (k-hop) or early-stop hits Not every graph is worth full closure materialization Indexing should follow the query/update ratio, not blind theoretical optimality Method Selection Prioritize online queries: BFS + hop limit + early stop Static graph with high query density: consider reach indexes (2-hop/bitmap) Dynamic graph with frequent updates: favor lightweight indexes + online search hybrid C — Concepts (Core Ideas) 1) BFS + Hop Limit For k-hop, BFS is the natural model because BFS layers are hop counts.\nState definition: (node, depth)\nPruning rules:\ndepth == k: do not expand neighbors further node == target: return true immediately for reachability visited_budget hits cap: return partial result or degrade 2) Reachability and Transitive Closure Transitive closure can be viewed as a boolean reachability matrix R:\nR[u][v] = 1 iff u reaches v Advantage: extremely fast queries.\nCost: expensive build, large storage, expensive updates.\nEngineering conclusion: usually do not fully materialize unless the graph is relatively static and query volume is high enough to amortize the cost.\n3) Bitmap Indexes / 2-hop Labeling / Reach Indexes 2-hop labeling decision form (directed reachability):\nFor each node x, maintain L_out(x) and L_in(x) u reaches v iff L_out(u) ∩ 
L_in(v) != ∅ (plus reflexive rules) Pros: very fast queries.\nChallenges: label construction and incremental maintenance are complex, and label size is highly graph-structure dependent.\nCommon engineering compromises:\nBitmap reach index (compressed storage) Hierarchical indexes + online BFS verification Landmark/Bloom prefilter + exact search fallback 4) Minimal Hand-Worked 2-hop Labeling Example Consider the directed graph:\nedges: 0 -\u0026gt; 1, 0 -\u0026gt; 2, 1 -\u0026gt; 3, 2 -\u0026gt; 3 A simplified label set (for demonstration):\nL_out(0) = {1,2,3} L_out(1) = {3} L_out(2) = {3} L_in(3) = {0,1,2} For reachable(0,3), check only:\nL_out(0) ∩ L_in(3) = {1,2,3} ∩ {0,1,2} = {1,2} != ∅ So you can return reachable without expanding the full online frontier.\nThis is why 2-hop is common in read-heavy, write-light workloads: move query cost into offline preprocessing.\nPractical Guide / Steps Quantify workload first: QPS, P99, graph scale, update frequency Build a baseline: iterative BFS + hop limit + early stop Add indexes only after load testing: prioritize bitmap index or lightweight reach index If index hits fail, fall back to online BFS In strict-correctness scenarios, Bloom can only prefilter, never decide alone Runnable Python example (python3 reachability_demo.py):\nfrom collections import deque from typing import List, Set def bfs_k_hop(graph: List[List[int]], s: int, k: int) -\u0026gt; Set[int]: vis = bytearray(len(graph)) vis[s] = 1 q = deque([(s, 0)]) out = {s} while q: u, d = q.popleft() if d == k: continue for v in graph[u]: if not vis[v]: vis[v] = 1 out.add(v) q.append((v, d + 1)) return out def reachable_bfs(graph: List[List[int]], s: int, t: int, hop_limit: int | None = None) -\u0026gt; bool: vis = bytearray(len(graph)) vis[s] = 1 q = deque([(s, 0)]) while q: u, d = q.popleft() if u == t: return True if hop_limit is not None and d == hop_limit: continue for v in graph[u]: if not vis[v]: vis[v] = 1 q.append((v, d + 1)) return False def 
transitive_closure_small(graph: List[List[int]]) -\u0026gt; List[int]: \u0026#34;\u0026#34;\u0026#34;Small-graph demo: one bitset row per node (Python int).\u0026#34;\u0026#34;\u0026#34; n = len(graph) rows = [0] * n for u in range(n): rows[u] |= 1 \u0026lt;\u0026lt; u for v in graph[u]: rows[u] |= 1 \u0026lt;\u0026lt; v # Warshall-bitset: if u reaches k, then u also reaches all nodes that k reaches for k in range(n): mk = 1 \u0026lt;\u0026lt; k rk = rows[k] for u in range(n): if rows[u] \u0026amp; mk: rows[u] |= rk return rows def reachable_by_tc(rows: List[int], u: int, v: int) -\u0026gt; bool: return ((rows[u] \u0026gt;\u0026gt; v) \u0026amp; 1) == 1 if __name__ == \u0026#34;__main__\u0026#34;: g = [[1, 2], [3], [3, 4], [5], [], []] print(\u0026#34;k\u0026lt;=2 from 0:\u0026#34;, sorted(bfs_k_hop(g, 0, 2))) print(\u0026#34;reachable 0-\u0026gt;5:\u0026#34;, reachable_bfs(g, 0, 5)) print(\u0026#34;reachable 4-\u0026gt;5:\u0026#34;, reachable_bfs(g, 4, 5)) tc = transitive_closure_small(g) print(\u0026#34;tc 0-\u0026gt;5:\u0026#34;, reachable_by_tc(tc, 0, 5)) E — Engineering (Applications) Scenario 1: k-hop Expansion on Risk-Control Graphs (Python) Background: Expand from risky seed accounts to accounts within k hops for real-time blocking.\nWhy this fits: BFS layer semantics align naturally with hop rules and budget control.\nfrom collections import deque def risk_expand(graph, seeds, k): vis = set(seeds) q = deque((s, 0) for s in seeds) while q: u, d = q.popleft() if d == k: continue for v in graph[u]: if v not in vis: vis.add(v) q.append((v, d + 1)) return vis Scenario 2: Fast Service-Call Reachability Checks (Go) Background: During incident debugging, determine whether service A can reach service B via call chains.\nWhy this fits: Reachability can stop immediately on hit, which works well for online diagnostics.\npackage main import \u0026#34;fmt\u0026#34; func Reachable(graph [][]int, s, t int) bool { vis := make([]bool, len(graph)) q := []int{s} vis[s] = true 
for head := 0; head \u0026lt; len(q); head++ { u := q[head] if u == t { return true } for _, v := range graph[u] { if !vis[v] { vis[v] = true q = append(q, v) } } } return false } func main() { g := [][]int{{1, 2}, {3}, {3, 4}, {5}, {}, {}} fmt.Println(Reachable(g, 0, 5)) } Scenario 3: Bitmap Index for Static Dependency Graphs (C++) Background: Build/compile dependency graphs update infrequently, but dependency-existence checks are very frequent.\nWhy this fits: Build bitmap closure once, then answer queries with O(1) bit checks.\n#include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; std::vector\u0026lt;unsigned long long\u0026gt; closure6(const std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt;\u0026amp; g) { int n = (int)g.size(); std::vector\u0026lt;unsigned long long\u0026gt; row(n, 0); for (int u = 0; u \u0026lt; n; ++u) { row[u] |= 1ULL \u0026lt;\u0026lt; u; for (int v : g[u]) row[u] |= 1ULL \u0026lt;\u0026lt; v; } for (int k = 0; k \u0026lt; n; ++k) { unsigned long long mk = 1ULL \u0026lt;\u0026lt; k; for (int u = 0; u \u0026lt; n; ++u) { if (row[u] \u0026amp; mk) row[u] |= row[k]; } } return row; } int main() { std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt; g = {{1,2},{3},{3,4},{5},{},{}}; auto r = closure6(g); std::cout \u0026lt;\u0026lt; (((r[0] \u0026gt;\u0026gt; 5) \u0026amp; 1ULL) ? 
\u0026#34;reachable\u0026#34; : \u0026#34;not\u0026#34;) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } R — Reflection (Deeper Analysis) Complexity Analysis Let the actually touched subgraph during query be V' nodes and E' edges:\nOnline BFS query: O(V' + E') k-hop query: worst-case still O(V'+E'), but usually much smaller due to k Full closure: BFS from every node: O(n*(n+m)) Boolean-matrix / bitset optimization still has high precompute and storage cost Alternatives and Tradeoffs Approach Query Build Update Best Fit BFS per query Medium None None Frequent updates, low/medium query volume Full closure Very fast Very high Very high Static small/medium graph, high query density 2-hop / reach index Fast Medium-high Medium-high Query-heavy workloads tolerant of offline build Lightweight index + BFS fallback Fast (avg) Medium Medium Common compromise for most online systems Common Mistakes Blindly materializing full closure, making build/storage uncontrollable Using Bloom alone for strict-correctness reachability decisions No hop/budget limits, causing long-tail latency explosions online Why This Is Engineering-Feasible BFS + hop limit gives a low-complexity, low-maintenance baseline Indexes are introduced gradually based on query density, avoiding over-design \u0026ldquo;Index hit + search fallback\u0026rdquo; balances latency and correctness FAQs and Notes Is Reachability the same as shortest path?\nNo. Reachability only asks whether a path exists, not the minimum distance.\nShould Transitive Closure never be computed?\nNot true. It is valuable on static graphs with high query density; most dynamic online graphs just cannot maintain full closure economically.\nIs 2-hop labeling always better than BFS?\nNo. 
It is query-friendly but heavier to build/maintain, so it fits read-heavy, write-light scenarios.\nBest Practices and Recommendations Ship an observable BFS baseline first (with hop limits, budgets, timeouts) Use real traffic profiles to decide whether a reach index is worth introducing Prioritize maintainability in index design before theoretical optimality Keep a downgrade path: fallback to BFS when indexes degrade or fail S — Summary (Wrap-up) Core Takeaways Reachability is an \u0026ldquo;algorithm + system constraints\u0026rdquo; problem, not a single-optimal-algorithm problem For k-hop, prefer BFS + hop limit + early stop Transitive Closure can make queries fast, but usually should not be fully materialized, especially on dynamic graphs 2-hop labeling / reach indexes are best for read-heavy, write-light workloads The most stable production pattern is usually \u0026ldquo;lightweight index + online BFS fallback\u0026rdquo; Recommended Further Reading LeetCode 1971 (Find if Path Exists in Graph) LeetCode 847 (Shortest Path Visiting All Nodes, state-space search extension) Graph database query-optimization docs (Neo4j / JanusGraph neighborhood query strategies) Classic reachability-index papers (2-hop labeling / GRAIL) Metadata Reading time: 12-16 minutes Tags: Reachability, k-hop, BFS, 2-hop labeling SEO keywords: Reachability, k-hop, Transitive Closure, 2-hop labeling, reach index Meta description: Engineering reachability query strategy: BFS+hop limits, closure tradeoffs, bitmap/2-hop indexing, and online fallback. 
Call to Action (CTA) Two practical next steps:\nAdd hop_limit and visit_budget parameters to your current reachability API Run an A/B load test on real traffic: \u0026ldquo;BFS per query\u0026rdquo; vs \u0026ldquo;lightweight index + fallback\u0026rdquo; If you want, I can write the next post as well: \u0026ldquo;Reachability Index Implementation Guide: when to pick 2-hop, when to pick GRAIL, and when to stick with BFS.\u0026rdquo;\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) from collections import deque def reachable(graph, s, t): vis = [False] * len(graph) q = deque([s]) vis[s] = True while q: u = q.popleft() if u == t: return True for v in graph[u]: if not vis[v]: vis[v] = True q.append(v) return False def k_hop(graph, s, k): vis = [False] * len(graph) q = deque([(s, 0)]) vis[s] = True out = {s} while q: u, d = q.popleft() if d == k: continue for v in graph[u]: if not vis[v]: vis[v] = True out.add(v) q.append((v, d + 1)) return out #include \u0026lt;stdbool.h\u0026gt; #include \u0026lt;stdio.h\u0026gt; #define N 6 bool reachable(int g[N][N], int s, int t) { int q[128], head = 0, tail = 0; bool vis[N] = {0}; q[tail++] = s; vis[s] = true; while (head \u0026lt; tail) { int u = q[head++]; if (u == t) return true; for (int v = 0; v \u0026lt; N; ++v) { if (g[u][v] \u0026amp;\u0026amp; !vis[v]) { vis[v] = true; q[tail++] = v; } } } return false; } int main(void) { int g[N][N] = {0}; g[0][1] = g[0][2] = 1; g[1][3] = 1; g[2][3] = g[2][4] = 1; g[3][5] = 1; printf(\u0026#34;%d\\n\u0026#34;, reachable(g, 0, 5)); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;queue\u0026gt; #include \u0026lt;vector\u0026gt; bool reachable(const std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt;\u0026amp; g, int s, int t) { std::vector\u0026lt;char\u0026gt; vis(g.size(), 0); std::queue\u0026lt;int\u0026gt; q; vis[s] = 1; q.push(s); while (!q.empty()) { int u = q.front(); q.pop(); if (u == t) return true; for (int v : g[u]) { if 
(!vis[v]) { vis[v] = 1; q.push(v); } } } return false; } int main() { std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt; g = {{1,2},{3},{3,4},{5},{},{}}; std::cout \u0026lt;\u0026lt; reachable(g, 0, 5) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } package main import \u0026#34;fmt\u0026#34; func reachable(graph [][]int, s, t int) bool { vis := make([]bool, len(graph)) q := []int{s} vis[s] = true for head := 0; head \u0026lt; len(q); head++ { u := q[head] if u == t { return true } for _, v := range graph[u] { if !vis[v] { vis[v] = true q = append(q, v) } } } return false } func main() { g := [][]int{{1, 2}, {3}, {3, 4}, {5}, {}, {}} fmt.Println(reachable(g, 0, 5)) } use std::collections::VecDeque; fn reachable(graph: \u0026amp;Vec\u0026lt;Vec\u0026lt;usize\u0026gt;\u0026gt;, s: usize, t: usize) -\u0026gt; bool { let mut vis = vec![false; graph.len()]; let mut q = VecDeque::new(); vis[s] = true; q.push_back(s); while let Some(u) = q.pop_front() { if u == t { return true; } for \u0026amp;v in \u0026amp;graph[u] { if !vis[v] { vis[v] = true; q.push_back(v); } } } false } fn main() { let g = vec![vec![1, 2], vec![3], vec![3, 4], vec![5], vec![], vec![]]; println!(\u0026#34;{}\u0026#34;, reachable(\u0026amp;g, 0, 5)); } function reachable(graph, s, t) { const vis = Array(graph.length).fill(false); const q = [s]; let head = 0; vis[s] = true; while (head \u0026lt; q.length) { const u = q[head++]; if (u === t) return true; for (const v of graph[u]) { if (!vis[v]) { vis[v] = true; q.push(v); } } } return false; } const g = [[1, 2], [3], [3, 4], [5], [], []]; console.log(reachable(g, 0, 5)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/30-k-hop-reachability-and-reach-index/","summary":"This article walks through k-hop and reachability queries in practice: BFS+hop limits, transitive-closure tradeoffs, and engineering rollout paths for bitmap indexes and 2-hop labeling.","title":"k-hop and Reachability Queries: BFS Limits, Reachability 
Indexes, and 2-hop Labeling ACERS Analysis"},{"content":" Subtitle / Abstract\nComponents are foundational for graph algorithms: undirected graphs ask \u0026ldquo;are nodes connected,\u0026rdquo; while directed graphs ask \u0026ldquo;are nodes mutually reachable.\u0026rdquo; Following the ACERS template, this article moves from naive methods to Tarjan / Kosaraju, then shows production graph-database use cases with runnable multi-language code.\nEstimated reading time: 14-18 minutes Tags: Graph Theory, Connected Components, SCC, Tarjan SEO keywords: Connected Components, SCC, Tarjan, Kosaraju, graph database Meta description: From undirected connected components to directed SCCs, with clear Tarjan/Kosaraju mechanics, complexity, and production rollout guidance. Target Audience Learners who need BFS/DFS to become second nature Engineers doing subgraph analysis and partition planning in graph-database systems Intermediate developers who want one unified framework for \u0026ldquo;undirected CC + directed SCC\u0026rdquo; Background / Motivation In production, you quickly hit three types of questions:\nDo these nodes naturally split into disconnected groups? (undirected connected components) Which nodes form mutually reachable strong cycles? (directed SCC) How can a large graph be split into subgraphs that are more parallelizable, cache-friendly, and shard-friendly? 
If you only know BFS/DFS but not the \u0026ldquo;component view,\u0026rdquo; you end up repeating reachability queries with high cost and weak maintainability.\nThe value of component algorithms is: one full-graph scan turns many local queries into O(1) component-ID comparisons.\nCore Concepts Connected Components (CC): maximal node sets in an undirected graph where any two nodes are reachable Strongly Connected Components (SCC): maximal node sets in a directed graph where any two nodes are mutually reachable Condensation DAG: DAG obtained by contracting each SCC into a single node Tarjan core state: dfn[u] (timestamp), low[u] (minimum reachable timestamp), stack and in_stack Kosaraju core flow: DFS finish-order on original graph + second DFS on reversed graph A — Algorithm (Problem and Algorithm) Problem Restatement (Engineering Formulation) Given a graph G=(V,E):\nIf G is undirected, output all Connected Components; If G is directed, output all Strongly Connected Components. Also return:\nThe number of components The component ID of every node Input / Output Name Type Description n int Number of nodes (0..n-1) edges List[(u,v)] Edge list directed bool Directed or not Return (k, comp_id[]) k is component count; comp_id[i] is node i\u0026rsquo;s component ID Example 1 (Undirected CC) n = 7 edges = [(0,1),(1,2),(3,4),(5,6)] Connected components: {0,1,2}, {3,4}, {5,6} k = 3 Example 2 (Directed SCC) n = 6 edges = [(0,1),(1,2),(2,0),(2,3),(3,4),(4,3),(4,5)] Strongly connected components: {0,1,2}, {3,4}, {5} k = 3 Reasoning Path (From Naive to Optimal) Naive Approach Run one reachability search (BFS/DFS) from every node Then merge or cross-compare the resulting sets Problems:\nTime complexity inflates to O(V*(V+E)) Repeated scanning over the same edges hurts cache locality and throughput Key Observations Undirected graph: from one unvisited node, a single BFS/DFS can consume one full connected component. 
Directed graph: one-way reachability is not enough; you need equivalence classes of mutual reachability (SCC). Method Selection Undirected graph: iterative BFS/DFS + visited (most robust) Directed graph: Tarjan (single DFS pass, more common in production) Kosaraju: very intuitive implementation, useful for cross-checking and education C — Concepts (Core Ideas) Method Categories Graph traversal: BFS / DFS Component decomposition: Connected Components / SCC Condensation modeling: SCC -\u0026gt; DAG Tarjan Invariants Maintain during DFS:\ndfn[u]: timestamp when node u is first visited low[u]: smallest reachable dfn from u via tree edges + back edges When dfn[u] == low[u], u is the root of one SCC. Pop until u to materialize that SCC.\nEssence of Kosaraju Record postorder by DFS finish time on the original graph Build the reversed graph Run DFS on the reversed graph in reverse postorder; each DFS yields one SCC Why Tarjan Is Common in Engineering One DFS pass for SCC decomposition (no explicit reverse-graph build required) Smaller constant factors and more direct memory behavior Easier to combine with online stats (for example SCC size thresholds) Practical Guide / Steps Undirected Connected Components (Iterative) Build adjacency list Start one stack/queue traversal from every unvisited node Assign comp_id during traversal Optional early stop: If only checking whether two nodes are in the same component, stop once confirmed If only building a k-hop subgraph, limit traversal depth Directed SCC (Tarjan) Maintain global timestamp time Push node on DFS entry and initialize dfn/low Recurse into unvisited neighbors; for stack neighbors, update low When dfn==low, pop a full SCC Engineering Choice for visited bitmap: exact, predictable, good for fixed ID spaces bloom filter: memory-saving but with false positives; suitable for approximate dedup, not strict-correctness traversal Runnable Example (Python) from collections import deque from typing import List, Tuple def 
connected_components_undirected(n: int, edges: List[Tuple[int, int]]) -\u0026gt; Tuple[int, List[int]]: graph = [[] for _ in range(n)] for u, v in edges: graph[u].append(v) graph[v].append(u) comp = [-1] * n cid = 0 for start in range(n): if comp[start] != -1: continue queue = deque([start]) comp[start] = cid while queue: u = queue.popleft() for v in graph[u]: if comp[v] == -1: comp[v] = cid queue.append(v) cid += 1 return cid, comp def scc_tarjan(n: int, edges: List[Tuple[int, int]]) -\u0026gt; Tuple[int, List[int]]: graph = [[] for _ in range(n)] for u, v in edges: graph[u].append(v) dfn = [-1] * n low = [0] * n in_stack = [False] * n stack = [] comp = [-1] * n time = 0 scc_id = 0 def dfs(u: int) -\u0026gt; None: nonlocal time, scc_id dfn[u] = low[u] = time time += 1 stack.append(u) in_stack[u] = True for v in graph[u]: if dfn[v] == -1: dfs(v) low[u] = min(low[u], low[v]) elif in_stack[v]: low[u] = min(low[u], dfn[v]) if dfn[u] == low[u]: while True: x = stack.pop() in_stack[x] = False comp[x] = scc_id if x == u: break scc_id += 1 for i in range(n): if dfn[i] == -1: dfs(i) return scc_id, comp if __name__ == \u0026#34;__main__\u0026#34;: n1 = 7 e1 = [(0, 1), (1, 2), (3, 4), (5, 6)] k1, c1 = connected_components_undirected(n1, e1) print(\u0026#34;Undirected CC count:\u0026#34;, k1, \u0026#34;comp:\u0026#34;, c1) n2 = 6 e2 = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3), (4, 5)] k2, c2 = scc_tarjan(n2, e2) print(\u0026#34;Directed SCC count:\u0026#34;, k2, \u0026#34;comp:\u0026#34;, c2) Run:\npython3 connected_components_demo.py E — Engineering (Applications) Scenario 1: Community Pre-grouping in Graph Databases (Python) Background: Before community analysis on a user-relationship graph, first remove isolated disconnected blocks.\nWhy this fits: Running CC first directly narrows the scope for later algorithms (for example Louvain).\ndef group_by_component(node_ids, comp_ids): groups = {} for node, cid in zip(node_ids, comp_ids): groups.setdefault(cid, 
[]).append(node) return groups Scenario 2: Subgraph Splitting for Parallel Task Dispatch (Go) Background: Split offline graph-compute tasks by component across workers to reduce cross-worker communication.\nWhy this fits: Components are naturally independent, so tasks parallelize without cross-dependencies.\npackage main import \u0026#34;fmt\u0026#34; func bucketByComp(comp []int) map[int][]int { b := map[int][]int{} for node, cid := range comp { b[cid] = append(b[cid], node) } return b } func main() { comp := []int{0, 0, 1, 1, 2} fmt.Println(bucketByComp(comp)) } Scenario 3: Partition Hints from Component IDs (JavaScript) Background: In online graph services, you want highly coupled nodes placed on the same shard as much as possible.\nWhy this fits: SCC/CC IDs are strong signals that reduce cross-shard edge ratio.\nfunction assignShardByComp(compIds, shardCount) { return compIds.map((cid) =\u0026gt; cid % shardCount); } console.log(assignShardByComp([0, 0, 1, 1, 2, 2], 2)); R — Reflection (Deeper Analysis) Complexity Analysis Undirected CC (BFS/DFS): O(V+E), space O(V) Tarjan SCC: O(V+E), space O(V) Kosaraju SCC: O(V+E), space O(V+E) (includes reverse graph) Alternatives and Tradeoffs Method Graph Type Time Complexity Pros Limits BFS/DFS components Undirected O(V+E) Intuitive, stable Cannot handle SCC Tarjan Directed O(V+E) Single pass, production-friendly Harder than plain BFS to implement Kosaraju Directed O(V+E) Clear mental model Needs reverse graph and two DFS passes Union-Find Static undirected connectivity approx O(E α(V)) Quick to implement Not suitable for SCC Why Tarjan Is More Engineering-Feasible Better fit for online pipelines: one pass can emit SCC IDs directly No reverse-graph build required, reducing extra memory and data movement Easy to attach additional metrics: SCC size, out-edge count, inter-SCC edge ratio Explanation and Principles (Why This Works) The essence of CC is \u0026ldquo;undirected reachability equivalence classes\u0026rdquo;; one 
traversal can fully cover one class. The essence of SCC is \u0026ldquo;directed mutual-reachability equivalence classes\u0026rdquo;; Tarjan uses dfn/low + stack to identify cycle roots online. After mapping nodes to comp_id, many queries are reduced in dimension: \u0026ldquo;Are these in the same group?\u0026rdquo; =\u0026gt; comp_id[u] == comp_id[v] \u0026ldquo;Partition hint?\u0026rdquo; =\u0026gt; hash(comp_id) FAQs and Notes Can Tarjan be used to compute SCC on undirected graphs?\nYes, but unnecessary; direct CC is simpler for undirected graphs.\nMust Tarjan be recursive?\nNo. It can be converted to an explicit-stack iterative version, but implementation complexity is higher.\nCan Bloom filter replace visited?\nNot for strict-correctness scenarios; false positives can skip nodes that should be traversed.\nWhy is my SCC ordering different?\nAs long as SCC partitioning is correct, numbering order can vary with traversal order.\nBest Practices and Recommendations Normalize node IDs to 0..n-1 before graph algorithms to avoid mapping bugs Prefer iterative BFS/DFS for undirected CC to avoid deep-recursion stack risk For large directed graphs, prioritize Tarjan; add Kosaraju when teaching or cross-validating Persist comp_id in production and reuse it for query, cache, and sharding decisions S — Summary (Wrap-up) Core Takeaways Components are the first \u0026ldquo;dimensionality reduction layer\u0026rdquo; in graph computing; one computation supports many query types. Undirected CC and directed SCC are different problems and cannot be mixed. Tarjan identifies SCC online with dfn/low in O(V+E), which is why it is production-common. Kosaraju is great for understanding and cross-validation; Tarjan is usually better for production rollout. In graph databases, comp_id can directly support coarse community grouping, subgraph splitting, and partition hints. Recommended Further Reading Tarjan, R. (1972). Depth-first search and linear graph algorithms. 
CLRS graph chapters (SCC, topological sorting) Neo4j Graph Data Science docs: Connected Components / SCC Closing Note If you already know BFS/DFS, the next required step is component thinking.\nIn engineering, the real value is not \u0026ldquo;can traverse,\u0026rdquo; but turning traversal outputs into stable, reusable structured labels (comp_id).\nMetadata Reading time: 14-18 minutes Tags: Graph Theory, Connected Components, SCC, Tarjan, Graph Database SEO keywords: Connected Components, SCC, Tarjan, Kosaraju, Graph Partitioning Meta description: A systematic guide to undirected CC and directed SCC, focused on Tarjan/Kosaraju and graph-database engineering rollout. Call to Action (CTA) Two immediate actions:\nRun CC / SCC on your production graph and output a histogram of comp_id distribution. Measure cross-component edge ratio and evaluate partitioning or subgraph-level parallelization. If you want, I can continue with \u0026ldquo;3) Shortest Paths (Dijkstra / A* / Multi-source BFS)\u0026rdquo; in the same ACERS style.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) from collections import deque def connected_components_undirected(n, edges): g = [[] for _ in range(n)] for u, v in edges: g[u].append(v) g[v].append(u) comp = [-1] * n cid = 0 for s in range(n): if comp[s] != -1: continue q = deque([s]) comp[s] = cid while q: u = q.popleft() for v in g[u]: if comp[v] == -1: comp[v] = cid q.append(v) cid += 1 return cid, comp #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; typedef struct { int* data; int size; int cap; } Vec; void push(Vec* v, int x) { if (v-\u0026gt;size == v-\u0026gt;cap) { v-\u0026gt;cap = v-\u0026gt;cap ? v-\u0026gt;cap * 2 : 4; v-\u0026gt;data = (int*)realloc(v-\u0026gt;data, sizeof(int) * v-\u0026gt;cap); } v-\u0026gt;data[v-\u0026gt;size++] = x; } int main(void) { int n = 5; int comp[5] = {-1, -1, -1, -1, -1}; // Demo placeholder: in real systems, build adjacency by edges and run BFS/DFS. 
comp[0] = comp[1] = 0; comp[2] = comp[3] = 1; comp[4] = 2; for (int i = 0; i \u0026lt; n; ++i) printf(\u0026#34;node %d -\u0026gt; comp %d\\n\u0026#34;, i, comp[i]); return 0; } #include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; pair\u0026lt;int, vector\u0026lt;int\u0026gt;\u0026gt; connectedComponentsUndirected(int n, const vector\u0026lt;pair\u0026lt;int,int\u0026gt;\u0026gt;\u0026amp; edges) { vector\u0026lt;vector\u0026lt;int\u0026gt;\u0026gt; g(n); for (auto [u,v] : edges) { g[u].push_back(v); g[v].push_back(u); } vector\u0026lt;int\u0026gt; comp(n, -1); int cid = 0; queue\u0026lt;int\u0026gt; q; for (int s = 0; s \u0026lt; n; ++s) { if (comp[s] != -1) continue; comp[s] = cid; q.push(s); while (!q.empty()) { int u = q.front(); q.pop(); for (int v : g[u]) { if (comp[v] == -1) { comp[v] = cid; q.push(v); } } } cid++; } return {cid, comp}; } int main() { vector\u0026lt;pair\u0026lt;int,int\u0026gt;\u0026gt; edges = {{0,1},{1,2},{3,4}}; auto [k, comp] = connectedComponentsUndirected(5, edges); cout \u0026lt;\u0026lt; \u0026#34;k=\u0026#34; \u0026lt;\u0026lt; k \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; for (int i = 0; i \u0026lt; (int)comp.size(); ++i) cout \u0026lt;\u0026lt; i \u0026lt;\u0026lt; \u0026#34;:\u0026#34; \u0026lt;\u0026lt; comp[i] \u0026lt;\u0026lt; \u0026#34; \u0026#34;; cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } package main import \u0026#34;fmt\u0026#34; func connectedComponentsUndirected(n int, edges [][2]int) (int, []int) { g := make([][]int, n) for _, e := range edges { u, v := e[0], e[1] g[u] = append(g[u], v) g[v] = append(g[v], u) } comp := make([]int, n) for i := range comp { comp[i] = -1 } cid := 0 q := make([]int, 0) for s := 0; s \u0026lt; n; s++ { if comp[s] != -1 { continue } comp[s] = cid q = append(q, s) for len(q) \u0026gt; 0 { u := q[0] q = q[1:] for _, v := range g[u] { if comp[v] == -1 { comp[v] = cid q = append(q, v) } } } cid++ } return cid, comp } func main() { edges := [][2]int{{0, 1}, {1, 2}, {3, 4}} k, 
comp := connectedComponentsUndirected(5, edges) fmt.Println(k, comp) } use std::collections::VecDeque; fn connected_components_undirected(n: usize, edges: \u0026amp;[(usize, usize)]) -\u0026gt; (usize, Vec\u0026lt;i32\u0026gt;) { let mut g = vec![vec![]; n]; for \u0026amp;(u, v) in edges { g[u].push(v); g[v].push(u); } let mut comp = vec![-1; n]; let mut cid: i32 = 0; for s in 0..n { if comp[s] != -1 { continue; } let mut q = VecDeque::new(); comp[s] = cid; q.push_back(s); while let Some(u) = q.pop_front() { for \u0026amp;v in \u0026amp;g[u] { if comp[v] == -1 { comp[v] = cid; q.push_back(v); } } } cid += 1; } (cid as usize, comp) } fn main() { let edges = vec![(0, 1), (1, 2), (3, 4)]; let (k, comp) = connected_components_undirected(5, \u0026amp;edges); println!(\u0026#34;{} {:?}\u0026#34;, k, comp); } function connectedComponentsUndirected(n, edges) { const g = Array.from({ length: n }, () =\u0026gt; []); for (const [u, v] of edges) { g[u].push(v); g[v].push(u); } const comp = Array(n).fill(-1); let cid = 0; for (let s = 0; s \u0026lt; n; s += 1) { if (comp[s] !== -1) continue; const queue = [s]; comp[s] = cid; while (queue.length) { const u = queue.shift(); for (const v of g[u]) { if (comp[v] === -1) { comp[v] = cid; queue.push(v); } } } cid += 1; } return [cid, comp]; } console.log(connectedComponentsUndirected(5, [[0, 1], [1, 2], [3, 4]])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/40-connected-components-and-scc-tarjan-kosaraju/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nComponents are foundational for graph algorithms: undirected graphs ask \u0026ldquo;are nodes connected,\u0026rdquo; while directed graphs ask \u0026ldquo;are nodes mutually reachable.\u0026rdquo; Following the ACERS template, this article moves from naive methods to Tarjan / Kosaraju, then shows production graph-database use cases with runnable multi-language 
code.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 14-18 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eGraph Theory\u003c/code\u003e, \u003ccode\u003eConnected Components\u003c/code\u003e, \u003ccode\u003eSCC\u003c/code\u003e, \u003ccode\u003eTarjan\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Connected Components, SCC, Tarjan, Kosaraju, graph database\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: From undirected connected components to directed SCCs, with clear Tarjan/Kosaraju mechanics, complexity, and production rollout guidance.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-audience\"\u003eTarget Audience\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLearners who need BFS/DFS to become second nature\u003c/li\u003e\n\u003cli\u003eEngineers doing subgraph analysis and partition planning in graph-database systems\u003c/li\u003e\n\u003cli\u003eIntermediate developers who want one unified framework for \u0026ldquo;undirected CC + directed SCC\u0026rdquo;\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eIn production, you quickly hit three types of questions:\u003c/p\u003e","title":"Connected Components and Strongly Connected Components: Tarjan / Kosaraju ACERS Engineering Analysis"},{"content":" Subtitle / Abstract\nShortest path is not one question. It is an engineering skill set: choose the right algorithm by graph conditions. 
This ACERS article breaks down BFS (unweighted) / Dijkstra (non-negative weights) / A* (heuristic) and gives optimization templates you actually use in relationship graphs, recommendation paths, and path explanations.\nEstimated reading time: 14-18 minutes Tags: Graph Theory, shortest path, BFS, Dijkstra, A* SEO keywords: shortest path, BFS, Dijkstra, A*, bidirectional search, multi-source BFS Meta description: Engineering guide to the shortest-path core trio: algorithm boundaries, complexity, runnable code, optimization strategies, and practical scenarios. Target Audience Learners reinforcing graph fundamentals who want reusable engineering templates Backend/algorithm engineers working on social links, recommendation paths, or graph-query explanations Developers who know BFS, Dijkstra, and A* by name but still struggle with robust selection and optimization Background / Motivation Shortest-path problems are common in:\nMinimal relationship chains in social networks (how many hops apart) Minimum-cost paths in recommendation systems (multi-objective trade-offs) \u0026ldquo;Why this was recommended\u0026rdquo; path displays in explainability systems The most common production mistake is forcing one algorithm onto every scenario:\nRunning BFS on weighted graphs (wrong result, no explicit error) Running Dijkstra on negative edges (unreliable result) Using A* with a poor heuristic (performance degrades to Dijkstra) In essence, shortest-path solutions should start with graph-condition classification, then algorithm selection.\nCore Concepts Algorithm Suitable Graph Optimality Condition Typical Complexity Keywords BFS Unweighted / equal-weight graph First arrival by layer gives minimum edge count O(V+E) queue, level Dijkstra Non-negative weighted graph Node popped from heap already has optimal distance O((V+E)logV) relaxation, min-heap A* Non-negative weighted graph + heuristic h(n) is admissible (never overestimates) Worst case same as Dijkstra, usually faster f=g+h Key 
formulas:\nDijkstra relaxation: update when dist[v] \u0026gt; dist[u] + w(u,v) A* evaluation function: f(n) = g(n) + h(n) Where:\ng(n) is known cost from start to n h(n) is heuristic estimated cost from n to target A - Algorithm (Problem and Algorithm) Unified Problem Model Given graph G=(V,E), start node s, and target node t, find both shortest path length and path from s to t.\nThe graph may be unweighted, or weighted with non-negative edge weights.\nInput and Output Name Type Description graph Adjacency list Graph structure; graph[u] is neighbors or (neighbor, weight) s Node ID Start node t Node ID Target node Return Distance + path Return INF/null or empty path when unreachable Example 1 (Unweighted graph) A -\u0026gt; B -\u0026gt; D A -\u0026gt; C -\u0026gt; D minimum edge count from A to D = 2 valid path: A-B-D or A-C-D Example 2 (Non-negative weighted graph) A -\u0026gt; B (2) A -\u0026gt; C (5) B -\u0026gt; C (1) B -\u0026gt; D (4) C -\u0026gt; D (1) minimum cost from A to D = 4 path: A-B-C-D Derivation (From naive to production-ready) Naive idea: enumerate all paths Use DFS to enumerate all s -\u0026gt; t paths, then choose minimum In cyclic graphs, dedup logic is complex and path count can be exponential Conclusion: not practical except on tiny graphs.\nKey Observation 1: if edge weights are equal, layer index is cost Shortest path reduces to \u0026ldquo;fewest edges\u0026rdquo; BFS expands by layers; first reach of target is optimal Key Observation 2: with non-negative weights, greedy shortest-prefix expansion works Dijkstra pops the node with current minimum dist With non-negative weights, popped nodes cannot be improved later Key Observation 3: if you can estimate how far a node is from target, search shrinks A* adds heuristic h(n) on top of Dijkstra Search is guided toward target, reducing irrelevant expansion C - Concepts (Core Ideas) Method Categories BFS: layered traversal + minimum hop count Dijkstra: shortest-path tree + relaxation + min-heap A*: 
shortest path + heuristic best-first search Relationship Among the Three Dijkstra is A* with h(n)=0 BFS is Dijkstra when all edge weights are 1 A* performance depends heavily on heuristic quality: Too weak: degrades to Dijkstra Too aggressive and overestimating: may lose optimality Engineering Selection Matrix Problem Feature Preferred Algorithm Notes Unweighted graph / minimum hops BFS Relationship chains, k-hop search Minimum cost with non-negative weights Dijkstra Stable default for backend services Non-negative weights + usable heuristic A* Road networks, spatial graphs, explanation paths Negative edges present Bellman-Ford/Johnson Do not use Dijkstra/A* Practical Guide / Steps Step 1: classify graph conditions first Unweighted or equal-weight? Yes -\u0026gt; BFS Any negative edges? Yes -\u0026gt; cannot use Dijkstra/A* Usable heuristic available? Yes -\u0026gt; prefer A* Step 2: unify path reconstruction interface Maintain parent mapping: parent[v] = u, then backtrack from t to s.\nStep 3: implement runnable template (Python) from collections import deque import heapq from math import inf def reconstruct_path(parent, s, t): if t not in parent and s != t: return [] path = [t] while path[-1] != s: path.append(parent[path[-1]]) path.reverse() return path def bfs_shortest_path(graph, s, t, max_depth=None): \u0026#34;\u0026#34;\u0026#34;graph[u] = [v1, v2, ...]\u0026#34;\u0026#34;\u0026#34; q = deque([(s, 0)]) parent = {s: s} visited = {s} while q: u, d = q.popleft() if u == t: return d, reconstruct_path(parent, s, t) if max_depth is not None and d \u0026gt;= max_depth: continue for v in graph.get(u, []): if v not in visited: visited.add(v) parent[v] = u q.append((v, d + 1)) return inf, [] def dijkstra_shortest_path(graph, s, t, max_cost=None): \u0026#34;\u0026#34;\u0026#34;graph[u] = [(v, w), ...], w \u0026gt;= 0\u0026#34;\u0026#34;\u0026#34; dist = {s: 0.0} parent = {s: s} pq = [(0.0, s)] while pq: du, u = heapq.heappop(pq) if du != dist.get(u, inf): continue if 
max_cost is not None and du \u0026gt; max_cost: continue if u == t: return du, reconstruct_path(parent, s, t) for v, w in graph.get(u, []): nd = du + w if nd \u0026lt; dist.get(v, inf): dist[v] = nd parent[v] = u heapq.heappush(pq, (nd, v)) return inf, [] def astar_shortest_path(graph, s, t, h): \u0026#34;\u0026#34;\u0026#34;h(u) is admissible heuristic estimate from u to t\u0026#34;\u0026#34;\u0026#34; g = {s: 0.0} parent = {s: s} pq = [(h(s), s)] # (f, node) while pq: f, u = heapq.heappop(pq) if u == t: return g[u], reconstruct_path(parent, s, t) for v, w in graph.get(u, []): ng = g[u] + w if ng \u0026lt; g.get(v, inf): g[v] = ng parent[v] = u heapq.heappush(pq, (ng + h(v), v)) return inf, [] if __name__ == \u0026#34;__main__\u0026#34;: unweighted = { \u0026#34;A\u0026#34;: [\u0026#34;B\u0026#34;, \u0026#34;C\u0026#34;], \u0026#34;B\u0026#34;: [\u0026#34;D\u0026#34;], \u0026#34;C\u0026#34;: [\u0026#34;D\u0026#34;], \u0026#34;D\u0026#34;: [], } print(bfs_shortest_path(unweighted, \u0026#34;A\u0026#34;, \u0026#34;D\u0026#34;)) # (2, [\u0026#39;A\u0026#39;, \u0026#39;B\u0026#39;, \u0026#39;D\u0026#39;]) or C path weighted = { \u0026#34;A\u0026#34;: [(\u0026#34;B\u0026#34;, 2), (\u0026#34;C\u0026#34;, 5)], \u0026#34;B\u0026#34;: [(\u0026#34;C\u0026#34;, 1), (\u0026#34;D\u0026#34;, 4)], \u0026#34;C\u0026#34;: [(\u0026#34;D\u0026#34;, 1)], \u0026#34;D\u0026#34;: [], } print(dijkstra_shortest_path(weighted, \u0026#34;A\u0026#34;, \u0026#34;D\u0026#34;)) # (4.0, [\u0026#39;A\u0026#39;, \u0026#39;B\u0026#39;, \u0026#39;C\u0026#39;, \u0026#39;D\u0026#39;]) heuristic = {\u0026#34;A\u0026#34;: 3, \u0026#34;B\u0026#34;: 2, \u0026#34;C\u0026#34;: 1, \u0026#34;D\u0026#34;: 0} print(astar_shortest_path(weighted, \u0026#34;A\u0026#34;, \u0026#34;D\u0026#34;, lambda x: heuristic[x])) E - Engineering (Production Applications) Scenario 1: shortest social link chain (BFS + bidirectional BFS) Background: given user A and user B, find a shortest relationship chain for explainability 
display.\nWhy it fits: this is an unweighted graph and objective is minimum hops; BFS matches naturally, and bidirectional BFS further reduces expansions.\nfrom collections import deque def bidirectional_bfs(graph, s, t, max_depth=6): if s == t: return 0 qa, qb = deque([s]), deque([t]) da, db = {s: 0}, {t: 0} while qa and qb: # expand smaller frontier first if len(qa) \u0026lt;= len(qb): q, dcur, dother = qa, da, db else: q, dcur, dother = qb, db, da u = q.popleft() if dcur[u] \u0026gt;= max_depth: continue for v in graph.get(u, []): if v in dcur: continue dcur[v] = dcur[u] + 1 if v in dother: return dcur[v] + dother[v] q.append(v) return -1 Scenario 2: recommendation path (Dijkstra) Background: edge weights represent \u0026ldquo;cost\u0026rdquo; (latency, risk, penalty); we need the lowest total cost path.\nWhy it fits: with non-negative weights, Dijkstra is stable and straightforward to service-ify.\npackage main import ( \u0026#34;container/heap\u0026#34; \u0026#34;fmt\u0026#34; ) type Edge struct{ To string; W float64 } type Item struct{ D float64; U string } type PQ []Item func (p PQ) Len() int { return len(p) } func (p PQ) Less(i, j int) bool { return p[i].D \u0026lt; p[j].D } func (p PQ) Swap(i, j int) { p[i], p[j] = p[j], p[i] } func (p *PQ) Push(x interface{}) { *p = append(*p, x.(Item)) } func (p *PQ) Pop() interface{} { old := *p; x := old[len(old)-1]; *p = old[:len(old)-1]; return x } func dijkstra(g map[string][]Edge, s, t string) float64 { const INF = 1e18 dist := map[string]float64{s: 0} pq := \u0026amp;PQ{{0, s}} heap.Init(pq) for pq.Len() \u0026gt; 0 { it := heap.Pop(pq).(Item) if it.D != dist[it.U] { continue } if it.U == t { return it.D } for _, e := range g[it.U] { nd := it.D + e.W if d, ok := dist[e.To]; !ok || nd \u0026lt; d { dist[e.To] = nd heap.Push(pq, Item{nd, e.To}) } } } return INF } func main() { g := map[string][]Edge{ \u0026#34;A\u0026#34;: {{\u0026#34;B\u0026#34;, 2}, {\u0026#34;C\u0026#34;, 5}}, \u0026#34;B\u0026#34;: 
{{\u0026#34;C\u0026#34;, 1}, {\u0026#34;D\u0026#34;, 4}}, \u0026#34;C\u0026#34;: {{\u0026#34;D\u0026#34;, 1}}, } fmt.Println(dijkstra(g, \u0026#34;A\u0026#34;, \u0026#34;D\u0026#34;)) // 4 } Scenario 3: explanation path in relationship graphs (A* + path pruning) Background: to show users \u0026ldquo;why X was recommended to Y,\u0026rdquo; we need explainable paths with controlled query latency.\nWhy it fits: A* can use domain priors (similarity distance) to cut expansions; combine with maxDepth pruning to control cost.\nfunction astar(graph, s, t, h, maxDepth = 6) { const g = new Map([[s, 0]]); const pq = [[h(s), 0, s]]; // [f, depth, node] while (pq.length) { pq.sort((a, b) =\u0026gt; a[0] - b[0]); const [f, depth, u] = pq.shift(); if (u === t) return g.get(u); if (depth \u0026gt;= maxDepth) continue; for (const [v, w] of (graph.get(u) || [])) { const ng = g.get(u) + w; if (!g.has(v) || ng \u0026lt; g.get(v)) { g.set(v, ng); pq.push([ng + h(v), depth + 1, v]); } } } return Infinity; } Optimization Essentials (Must-know) 1) Multi-source BFS Queue multiple sources at once and run one unified BFS. 
Useful for tasks like \u0026ldquo;distance to nearest interest point\u0026rdquo; or \u0026ldquo;batch infection radius spread.\u0026rdquo;\nfrom collections import deque def multi_source_bfs(graph, sources): q = deque(sources) dist = {s: 0 for s in sources} while q: u = q.popleft() for v in graph.get(u, []): if v not in dist: dist[v] = dist[u] + 1 q.append(v) return dist 2) Bidirectional BFS / bidirectional Dijkstra Bidirectional BFS: usually cuts search depth significantly on unweighted graphs Bidirectional Dijkstra: can reduce state expansion on non-negative weighted graphs, with higher implementation complexity 3) Path pruning (max_depth / max_cost) In online services, guarantee usable latency first, then optimize coverage:\nBFS: max_depth Dijkstra: max_cost A*: max_depth + heuristic 4) visited bitmap / bloom bitmap: exact and memory-controllable (prefer when node IDs can be mapped to contiguous integers) bloom: more space-efficient but has false positives; suitable for recall-oriented prefiltering, not for strict-optimality main decision chains R - Reflection (Deep Dive) Complexity Comparison Algorithm Time Complexity Space Complexity BFS O(V+E) O(V) Dijkstra (heap) O((V+E)logV) O(V) A* Worst case same as Dijkstra O(V) Alternatives and Trade-offs Approach Conditions Cost When to Choose Bellman-Ford Negative weights allowed O(VE) Must support negative edges Floyd-Warshall All-pairs shortest paths O(V^3) Small offline graph, full-pair queries Core trio in this article High-frequency online queries Low to medium Most online path problems in engineering Common Wrong Approaches Using BFS on weighted graphs Ignoring negative-edge checks and running Dijkstra directly Using an unreasonable heuristic in A*, causing heavy invalid expansion Marking visited too early (can miss better paths in weighted graphs) Why This Set Is the Most Practical in Engineering Covers the most common graph conditions (unweighted + non-negative weights + heuristics) Supports unified interface 
abstraction so business layers only care about a \u0026ldquo;path query service\u0026rdquo; Composes naturally with bidirectional search and pruning for SLA control Explanation and Principles (Why this works) You can view these three methods as one evolution line:\nBFS: expand by layers for equal edge-cost settings Dijkstra: expand by minimum known true cost for non-negative weighted settings A*: add target-oriented heuristics to Dijkstra to reduce irrelevant expansion The core difference is not coding style, but what determines expansion order:\nBFS uses level Dijkstra uses g A* uses g+h FAQ and Notes What if the graph is disconnected?\nReturn unreachable (INF or empty path). Do not force path backtracking.\nWhen should visited be finalized in Dijkstra?\nRecommended: finalize when the node is popped and confirmed as the current optimal dist.\nHow to choose h(n) in A*?\nRoad networks often use Manhattan/Euclidean distance. Recommendation graphs can use embedding-distance lower bounds. Avoid systematic overestimation.\nWhen should I use bidirectional search?\nWhen both endpoints are known and the graph is large with high branching factor, benefits are usually significant.\nBest Practices and Recommendations Validate graph conditions first (unweighted? negative weights? usable heuristic?) 
Make path reconstruction, pruning, and logging metrics a shared middleware layer In online services, prioritize tail-latency guarantees before maximum coverage On large graphs, prefer adjacency lists + ID compression + bitmap visited S - Summary Key Takeaways BFS, Dijkstra, and A* are the shortest-path engineering core trio, and the key is condition-based selection Use BFS for unweighted graphs, Dijkstra for non-negative weighted graphs, and A* when reliable heuristics exist Multi-source, bidirectional search, and pruning are not optional polish; they are primary tools for online performance/cost control A* performance ceiling depends on heuristic quality; weak heuristics degrade performance A unified path-service interface significantly lowers switching cost among algorithms Recommended Follow-up Reading LeetCode 127 (Word Ladder, bidirectional BFS) LeetCode 743 (Network Delay Time, Dijkstra) Classic A* paper: Hart, Nilsson, Raphael (1968) Negative-weight scenarios: Bellman-Ford / Johnson Metadata Reading time: 14-18 minutes Tags: Graph Theory, shortest path, BFS, Dijkstra, A*, bidirectional search SEO keywords: shortest path, BFS, Dijkstra, A*, bidirectional BFS, multi-source BFS Meta description: Engineering guide to the shortest-path core trio: algorithm boundaries, complexity, optimization strategies, and runnable code. 
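The condition-based selection emphasized in the takeaways above can be sketched as a small dispatcher that inspects the graph before any search runs. This is an illustrative sketch, not code from the article: the name choose_algorithm, its string labels, and the assumed adjacency shape graph[u] = [(v, w), ...] are all assumptions made here for demonstration.

```python
def choose_algorithm(graph, heuristic=None):
    """Pick a shortest-path algorithm from graph conditions.

    Assumes graph[u] = [(v, w), ...]; `choose_algorithm` and the
    returned labels are illustrative, not part of any article API.
    """
    # Collect the distinct edge weights present in the graph.
    weights = {w for edges in graph.values() for _, w in edges}
    if any(w < 0 for w in weights):
        # Negative edges: Dijkstra/A* results are unreliable here.
        return "bellman-ford"
    if len(weights) <= 1:
        # Unweighted or equal-weight graph: BFS layers equal cost.
        return "bfs"
    if heuristic is not None:
        # A usable (admissible) heuristic makes guided search pay off.
        return "a*"
    # Non-negative weights, no heuristic: the stable default.
    return "dijkstra"
```

Behind a unified path-service interface, a dispatcher like this is the single place where the BFS/Dijkstra/A* implementations get swapped, so business callers never choose an algorithm directly.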
Call To Action (CTA) Recommended next steps using the same template:\nRefactor your current graph-query API into a pluggable algorithm interface (switchable BFS/Dijkstra/A*) Add online metrics: expanded node count, average path length, and P95 query latency If you want, I can write the next article directly: \u0026ldquo;Engineering shortest paths on negative-weight graphs (Bellman-Ford/Johnson).\u0026rdquo;\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) # Dijkstra (non-negative weights), adjacency list import heapq from math import inf def dijkstra(graph, s, t): dist = {s: 0.0} parent = {s: s} pq = [(0.0, s)] while pq: du, u = heapq.heappop(pq) if du != dist.get(u, inf): continue if u == t: break for v, w in graph.get(u, []): nd = du + w if nd \u0026lt; dist.get(v, inf): dist[v] = nd parent[v] = u heapq.heappush(pq, (nd, v)) if t not in dist: return inf, [] path = [t] while path[-1] != s: path.append(parent[path[-1]]) path.reverse() return dist[t], path // Dijkstra O(V^2) demo for dense/small graphs (non-negative weights) #include \u0026lt;stdio.h\u0026gt; #define N 5 #define INF 1000000000 int main(void) { int g[N][N] = { {0, 2, 5, 0, 0}, {0, 0, 1, 4, 0}, {0, 0, 0, 1, 0}, {0, 0, 0, 0, 0}, {0, 0, 0, 0, 0} }; int s = 0, t = 3; int dist[N], vis[N] = {0}; for (int i = 0; i \u0026lt; N; i++) dist[i] = INF; dist[s] = 0; for (int i = 0; i \u0026lt; N; i++) { int u = -1; for (int j = 0; j \u0026lt; N; j++) if (!vis[j] \u0026amp;\u0026amp; (u == -1 || dist[j] \u0026lt; dist[u])) u = j; if (u == -1 || dist[u] == INF) break; vis[u] = 1; for (int v = 0; v \u0026lt; N; v++) { if (g[u][v] \u0026gt; 0 \u0026amp;\u0026amp; dist[v] \u0026gt; dist[u] + g[u][v]) dist[v] = dist[u] + g[u][v]; } } if (dist[t] \u0026gt;= INF) printf(\u0026#34;unreachable\\n\u0026#34;); else printf(\u0026#34;dist=%d\\n\u0026#34;, dist[t]); return 0; } #include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; pair\u0026lt;long long, vector\u0026lt;int\u0026gt;\u0026gt; 
dijkstra(int n, vector\u0026lt;vector\u0026lt;pair\u0026lt;int,int\u0026gt;\u0026gt;\u0026gt;\u0026amp; g, int s, int t) { const long long INF = (1LL\u0026lt;\u0026lt;60); vector\u0026lt;long long\u0026gt; dist(n, INF); vector\u0026lt;int\u0026gt; parent(n, -1); priority_queue\u0026lt;pair\u0026lt;long long,int\u0026gt;, vector\u0026lt;pair\u0026lt;long long,int\u0026gt;\u0026gt;, greater\u0026lt;pair\u0026lt;long long,int\u0026gt;\u0026gt;\u0026gt; pq; dist[s] = 0; parent[s] = s; pq.push({0, s}); while (!pq.empty()) { auto [du, u] = pq.top(); pq.pop(); if (du != dist[u]) continue; if (u == t) break; for (auto [v, w] : g[u]) { long long nd = du + w; if (nd \u0026lt; dist[v]) { dist[v] = nd; parent[v] = u; pq.push({nd, v}); } } } if (dist[t] == INF) return {INF, {}}; vector\u0026lt;int\u0026gt; path; for (int x = t; x != s; x = parent[x]) path.push_back(x); path.push_back(s); reverse(path.begin(), path.end()); return {dist[t], path}; } package main import ( \u0026#34;container/heap\u0026#34; \u0026#34;fmt\u0026#34; ) type Edge struct{ To int; W int64 } type Item struct{ D int64; U int } type PQ []Item func (p PQ) Len() int { return len(p) } func (p PQ) Less(i, j int) bool { return p[i].D \u0026lt; p[j].D } func (p PQ) Swap(i, j int) { p[i], p[j] = p[j], p[i] } func (p *PQ) Push(x interface{}) { *p = append(*p, x.(Item)) } func (p *PQ) Pop() interface{} { old := *p; x := old[len(old)-1]; *p = old[:len(old)-1]; return x } func dijkstra(g [][]Edge, s, t int) int64 { const INF int64 = 1\u0026lt;\u0026lt;60 dist := make([]int64, len(g)) for i := range dist { dist[i] = INF } dist[s] = 0 pq := \u0026amp;PQ{{0, s}} heap.Init(pq) for pq.Len() \u0026gt; 0 { it := heap.Pop(pq).(Item) if it.D != dist[it.U] { continue } if it.U == t { return it.D } for _, e := range g[it.U] { nd := it.D + e.W if nd \u0026lt; dist[e.To] { dist[e.To] = nd heap.Push(pq, Item{nd, e.To}) } } } return INF } func main() { g := make([][]Edge, 4) g[0] = []Edge{{1, 2}, {2, 5}} g[1] = []Edge{{2, 1}, {3, 
4}} g[2] = []Edge{{3, 1}} fmt.Println(dijkstra(g, 0, 3)) // 4 } use std::cmp::Reverse; use std::collections::BinaryHeap; fn dijkstra(graph: \u0026amp;Vec\u0026lt;Vec\u0026lt;(usize, i64)\u0026gt;\u0026gt;, s: usize, t: usize) -\u0026gt; i64 { let inf = i64::MAX / 4; let mut dist = vec![inf; graph.len()]; let mut pq = BinaryHeap::new(); dist[s] = 0; pq.push((Reverse(0_i64), s)); while let Some((Reverse(du), u)) = pq.pop() { if du != dist[u] { continue; } if u == t { return du; } for \u0026amp;(v, w) in \u0026amp;graph[u] { let nd = du + w; if nd \u0026lt; dist[v] { dist[v] = nd; pq.push((Reverse(nd), v)); } } } inf } fn main() { let g = vec![ vec![(1,2),(2,5)], vec![(2,1),(3,4)], vec![(3,1)], vec![] ]; println!(\u0026#34;{}\u0026#34;, dijkstra(\u0026amp;g, 0, 3)); // 4 } // BFS shortest hops in unweighted graph function bfsShortest(graph, s, t) { const q = [[s, 0]]; const seen = new Set([s]); while (q.length) { const [u, d] = q.shift(); if (u === t) return d; for (const v of (graph.get(u) || [])) { if (!seen.has(v)) { seen.add(v); q.push([v, d + 1]); } } } return Infinity; } const g = new Map([ [\u0026#34;A\u0026#34;, [\u0026#34;B\u0026#34;, \u0026#34;C\u0026#34;]], [\u0026#34;B\u0026#34;, [\u0026#34;D\u0026#34;]], [\u0026#34;C\u0026#34;, [\u0026#34;D\u0026#34;]], [\u0026#34;D\u0026#34;, []], ]); console.log(bfsShortest(g, \u0026#34;A\u0026#34;, \u0026#34;D\u0026#34;)); // 2 ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/20-shortest-path-bfs-dijkstra-astar-acers/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nShortest path is not one question. It is an engineering skill set: choose the right algorithm by graph conditions. 
This ACERS article breaks down \u003cem\u003eBFS (unweighted) / Dijkstra (non-negative weights) / A* (heuristic)\u003c/em\u003e and gives optimization templates you actually use in relationship graphs, recommendation paths, and path explanations.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 14-18 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eGraph Theory\u003c/code\u003e, \u003ccode\u003eshortest path\u003c/code\u003e, \u003ccode\u003eBFS\u003c/code\u003e, \u003ccode\u003eDijkstra\u003c/code\u003e, \u003ccode\u003eA*\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: shortest path, BFS, Dijkstra, A*, bidirectional search, multi-source BFS\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Engineering guide to the shortest-path core trio: algorithm boundaries, complexity, runnable code, optimization strategies, and practical scenarios.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-audience\"\u003eTarget Audience\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLearners reinforcing graph fundamentals who want reusable engineering templates\u003c/li\u003e\n\u003cli\u003eBackend/algorithm engineers working on social links, recommendation paths, or graph-query explanations\u003c/li\u003e\n\u003cli\u003eDevelopers who know BFS, Dijkstra, and A* by name but still struggle with robust selection and optimization\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eShortest-path problems are common in:\u003c/p\u003e","title":"Shortest Path Core Trio: BFS, Dijkstra, and A* ACERS Engineering Breakdown"},{"content":" Subtitle / Abstract\nBFS / DFS are not just about \u0026ldquo;being able to code them.\u0026rdquo; You need production-ready 
behavior, predictable cost, and provable correctness. Following the ACERS structure, this article breaks three common tasks (k-hop query, subgraph extraction, and path existence) into reusable templates: iterative implementation + early stop + visited structure selection.\nEstimated reading time: 12-16 minutes Tags: Graph, BFS, DFS, k-hop, subgraph extraction SEO keywords: BFS, DFS, k-hop query, subgraph extraction, path existence, visited bitmap, bloom filter Meta description: BFS/DFS for engineering scenarios: iterative implementations to avoid stack overflow, early stop to cut search cost, and visited bitmap/bloom to optimize memory and dedup performance. Target Audience Engineers working on graph databases, risk-control relationship graphs, or call-chain analysis Learners who can write \u0026ldquo;problem-solution style BFS/DFS\u0026rdquo; but do not yet have engineering templates Developers who want traversal code that is stable, observable, and extensible Background / Motivation In production systems, BFS/DFS is usually not a one-off offline script. 
It is part of an online request path:\nk-hop neighborhood queries need latency control Subgraph extraction needs memory and output-size control Path existence checks need fast true/false responses If you stop at textbook recursive templates, you quickly hit issues:\nDeep graphs cause recursive stack overflow No pruning causes unnecessary expansion The wrong visited structure hurts both memory and throughput So this article has one focus: upgrade BFS/DFS to a level where you can use them fluently in production.\nCore Concepts Concept Purpose Engineering Focus BFS (queue) Layer-by-layer expansion, natural support for hop levels Good for k-hop, minimum edge count, layered subgraphs DFS (stack) Deep exploration, efficient for path existence checks Good for fast reachability decisions and depth-based pruning early stop Stop search as soon as conditions are met Controls P99 latency and resource usage visited bitmap Exact dedup with compact memory Requires node ID compression first bloom filter Probabilistic dedup / prefilter Has false positives; cannot be used alone in strict-correctness tasks A - Algorithm (Problem and Algorithm) Problem Restatement (LeetCode-style training problem) Given an unweighted graph G (adjacency list), a start node s, a maximum hop count K, and an optional target node t:\nReturn the set of nodes reachable from s within K hops (k-hop query) Return the subgraph formed by visited nodes and edges (subgraph extraction) Determine whether a path s -\u0026gt; t exists (path existence) Requirements:\nUse iterative BFS/DFS (no recursion) Support early stop (for example: beyond K hops, target hit, or business predicate hit) Maintain visited state to avoid repeated expansion Input and Output Name Type Description graph List[List[int]] Adjacency list, node IDs are 0..n-1 s int Start node K int Maximum hop count (for BFS) t int Target node (for reachability) Return 1 Set[int] Nodes reachable within K hops Return 2 List[Tuple[int,int]] Extracted edge set 
(optional) Return 3 bool Reachable or not Example 1: k-hop query graph = [ [1,2], # 0 [3], # 1 [3,4], # 2 [5], # 3 [], # 4 [] # 5 ] s = 0, K = 2 output nodes: {0,1,2,3,4} Explanation: within 2 hops, you can reach 0 (0 hop), 1/2 (1 hop), 3/4 (2 hops). Node 5 needs 3 hops.\nExample 2: path existence same graph as above s = 0, t = 5 output: true Derivation (From naive approach to engineering template) Naive version: recursive DFS / BFS without pruning Recursive DFS can hit stack-depth limits on deep graphs BFS without hop limits may scan the full graph Without visited state, expansions can repeat exponentially Key Observations In practice, you usually want not \u0026ldquo;full-graph traversal\u0026rdquo; but the minimal traversal that satisfies business constraints Search order can be templated (queue/BFS, stack/DFS), but pruning must be business-aware visited is not one fixed implementation. You must choose based on graph scale and correctness needs Method Selection k-hop: prefer BFS (naturally layered) Path existence: prefer iterative DFS (stack + early stop) Large graphs: ID compression + bitmap; for high-throughput, weak-consistency dedup, add bloom prefiltering C - Concepts (Core Ideas) Method Categories Graph Traversal Layered Search (BFS) Depth Search (DFS) Pruned Search Engineering Invariants visited[u] = true means node u has been queued/stacked (or consumed, depending on policy) In BFS, (node, depth).depth never exceeds K After an early-stop condition triggers, returned results still satisfy business-defined correctness Early-Stop Design Template hop limit: stop expanding neighbors when depth == K target hit: return immediately when node == t budget control: stop when visited-node count exceeds threshold and return partial results predicate pruning: skip expansion when node attributes do not satisfy business rules visited Structure Selection Structure Correctness Memory Speed Suitable Scenarios HashSet Exact Medium-high Fast Sparse node IDs, dynamic IDs 
Bitmap Exact Lowest (bit-level) Fast Node IDs can be compressed to contiguous integers Bloom Filter Approximate (false positives) Very low Fast Prefiltering and dedup acceleration (error-tolerant) Key conclusions:\nStrict correctness tasks (for example permission checks, risk-control hits) cannot rely on bloom alone The safest bloom usage is \u0026ldquo;prefilter + exact-structure confirmation\u0026rdquo; Practical Guide / Steps Normalize node IDs first (compress to 0..n-1 when needed) For k-hop, use BFS with depth in queue entries For path existence, use iterative DFS with a stack of pending nodes Apply early-stop checks at the top of each loop Prefer bitmap for visited (when compressible), otherwise HashSet If dedup checks are the throughput bottleneck, add bloom prefiltering Runnable Python example (python3 bfs_dfs_demo.py):\nfrom collections import deque from typing import List, Set class SimpleBloom: \u0026#34;\u0026#34;\u0026#34;Demo Bloom filter: prefilter only, not a standalone correctness guarantee.\u0026#34;\u0026#34;\u0026#34; def __init__(self, m: int = 1 \u0026lt;\u0026lt; 15): self.m = m self.bits = bytearray(m // 8 + 1) def _idx(self, x: int, salt: int) -\u0026gt; int: return hash((x, salt)) \u0026amp; (self.m - 1) def _set(self, i: int) -\u0026gt; None: self.bits[i \u0026gt;\u0026gt; 3] |= 1 \u0026lt;\u0026lt; (i \u0026amp; 7) def _get(self, i: int) -\u0026gt; bool: return (self.bits[i \u0026gt;\u0026gt; 3] \u0026gt;\u0026gt; (i \u0026amp; 7)) \u0026amp; 1 == 1 def add(self, x: int) -\u0026gt; None: for salt in (17, 31, 73): self._set(self._idx(x, salt)) def maybe_contains(self, x: int) -\u0026gt; bool: return all(self._get(self._idx(x, salt)) for salt in (17, 31, 73)) def bfs_k_hop(graph: List[List[int]], s: int, k: int) -\u0026gt; Set[int]: n = len(graph) visited = bytearray(n) # bitmap q = deque([(s, 0)]) visited[s] = 1 result = {s} while q: u, d = q.popleft() if d == k: continue for v in graph[u]: if not visited[v]: visited[v] = 1 result.add(v) 
q.append((v, d + 1)) return result def dfs_path_exists(graph: List[List[int]], s: int, t: int) -\u0026gt; bool: n = len(graph) visited = bytearray(n) stack = [s] visited[s] = 1 while stack: u = stack.pop() if u == t: # early stop return True for v in graph[u]: if not visited[v]: visited[v] = 1 stack.append(v) return False def bfs_with_bloom_prefilter(graph: List[List[int]], s: int, limit: int = 100000) -\u0026gt; int: \u0026#34;\u0026#34;\u0026#34;Example: bloom reduces set lookups; exact set still guarantees correctness.\u0026#34;\u0026#34;\u0026#34; q = deque([s]) exact = {s} bloom = SimpleBloom() bloom.add(s) visited_count = 0 while q and visited_count \u0026lt; limit: u = q.popleft() visited_count += 1 for v in graph[u]: # bloom says \u0026#34;not seen\u0026#34; =\u0026gt; definitely unseen, enqueue directly if not bloom.maybe_contains(v): bloom.add(v) exact.add(v) q.append(v) continue # bloom says \u0026#34;maybe seen\u0026#34; =\u0026gt; confirm with exact set if v not in exact: exact.add(v) q.append(v) return visited_count if __name__ == \u0026#34;__main__\u0026#34;: graph = [ [1, 2], # 0 [3], # 1 [3, 4], # 2 [5], # 3 [], # 4 [], # 5 ] print(\u0026#34;k-hop\u0026lt;=2:\u0026#34;, sorted(bfs_k_hop(graph, 0, 2))) print(\u0026#34;path 0-\u0026gt;5:\u0026#34;, dfs_path_exists(graph, 0, 5)) print(\u0026#34;bloom+exact visits:\u0026#34;, bfs_with_bloom_prefilter(graph, 0, limit=100)) E - Engineering (Production Applications) Scenario 1: k-hop neighborhood query in graph databases (Python) Background: users provide seed nodes, and the system returns neighborhood nodes within N hops.\nWhy it fits: BFS is naturally layered, and depth directly maps to k-hop business semantics.\nfrom collections import deque def k_hop_nodes(graph, s, k): q = deque([(s, 0)]) vis = {s} out = {s} while q: u, d = q.popleft() if d == k: continue for v in graph[u]: if v not in vis: vis.add(v) out.add(v) q.append((v, d + 1)) return out Scenario 2: call-chain fault tracing (Go) Background: 
determine whether service A can reach faulty service B in a call graph.\nWhy it fits: iterative DFS with target-hit early stop often returns faster than full-graph scans.\npackage main import \u0026#34;fmt\u0026#34; func pathExists(graph [][]int, s, t int) bool { vis := make([]bool, len(graph)) stack := []int{s} vis[s] = true for len(stack) \u0026gt; 0 { u := stack[len(stack)-1] stack = stack[:len(stack)-1] if u == t { return true } for _, v := range graph[u] { if !vis[v] { vis[v] = true stack = append(stack, v) } } } return false } func main() { g := [][]int{{1, 2}, {3}, {3, 4}, {5}, {}, {}} fmt.Println(pathExists(g, 0, 5)) // true } Scenario 3: online dedup prefiltering in relationship graphs (C++) Background: under high QPS, visited-set lookups become a hotspot.\nWhy it fits: use bloom for fast \u0026ldquo;possibly unseen\u0026rdquo; routing, then confirm with exact bitmap/set to reduce average dedup cost.\n#include \u0026lt;bitset\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;unordered_set\u0026gt; struct Bloom { static const int M = 1 \u0026lt;\u0026lt; 16; std::bitset\u0026lt;M\u0026gt; bits; int h1(int x) const { return (x * 1315423911u) \u0026amp; (M - 1); } int h2(int x) const { return (x * 2654435761u) \u0026amp; (M - 1); } void add(int x) { bits.set(h1(x)); bits.set(h2(x)); } bool maybe(int x) const { return bits.test(h1(x)) \u0026amp;\u0026amp; bits.test(h2(x)); } }; int main() { Bloom b; std::unordered_set\u0026lt;int\u0026gt; exact; for (int x : {1, 2, 3}) { b.add(x); exact.insert(x); } int q = 4; // \u0026#34;not visited\u0026#34; means: bloom says definitely unseen, or exact set confirms absence if (!b.maybe(q) || exact.find(q) == exact.end()) { std::cout \u0026lt;\u0026lt; \u0026#34;not visited yet\\n\u0026#34;; } } R - Reflection (Deep Dive) Complexity Analysis Let V' and E' be the node and edge counts of the visited subgraph:\nBFS / DFS time complexity: O(V' + E') Extra space for visited: HashSet: O(V') Bitmap: O(N) bits (N is the full-graph node upper bound) Bloom: O(m) bits (m is bit-array size, tunable approximation) For k-hop 
tasks, V' and E' are often much smaller than full-graph size, which is where early stop provides most value.\nAlternatives and Trade-offs Approach Pros Cons Best For Recursive DFS Short code Stack risk on deep graphs, weaker controllability Small offline scripts Iterative DFS Controllable, easy to add early stop Manual stack management Path existence / online checks BFS Clear layering, suitable for hop constraints Peak memory may be higher than DFS k-hop / layered retrieval Bidirectional BFS Faster point-to-point path queries Higher implementation complexity Sparse graph, single-source to single-target Common Wrong Approaches Mark visited only at dequeue time: can cause repeated enqueues and queue blowup Use bloom alone as visited: false positives can skip nodes that should be visited No budget limits: online requests can suffer long-tail latency at high-degree nodes Why This Is the Most Practical Engineering Strategy Iterative implementation avoids recursion risks early stop constrains search cost inside business boundaries bitmap/bloom make visited strategy flexible by graph scale FAQ and Notes Which is faster, BFS or DFS?\nThere is no absolute winner. BFS is common for k-hop. For reachability where the target may appear deep, DFS often hits faster.\nCan bloom false positives affect correctness?\nYes. If bloom is used alone for dedup, false positives can skip valid search branches. 
Strict-correctness tasks must use exact confirmation.\nWhen should visited be marked?\nUsually at enqueue/push time, so the same node is not inserted repeatedly.\nBest Practices and Recommendations Define business stop conditions before writing traversal code Default to iterative versions; use recursion only for small offline tools Prefer bitmap when node IDs can be compressed, balancing speed and memory Use bloom only as a prefilter, not as a standalone correctness guarantee Add visit caps and latency monitoring to traversal to avoid online cascades S - Summary Key Takeaways Engineering-grade BFS/DFS is about iterative containers, clear invariants, and explicit early-stop conditions Prefer BFS for k-hop queries and subgraph extraction; prefer iterative DFS for path existence visited has no universal answer: HashSet, bitmap, and bloom each have boundaries Bloom has false positives; use it as \u0026ldquo;prefilter + exact confirmation,\u0026rdquo; not as a standalone strict decision source Make search budgets (hop count, node budget, time budget) explicit parameters for stable production behavior Recommended Follow-up Reading LeetCode 200 (Number of Islands): graph traversal templates LeetCode 127 (Word Ladder): BFS + pruning Graph500 / graph computing benchmarks: ideas for large-scale traversal performance Classic Bloom-filter papers and engineering parameter sizing (false-positive rate vs bit-array size) Metadata Reading time: 12-16 minutes Tags: Graph, BFS, DFS, k-hop, subgraph extraction SEO keywords: BFS, DFS, k-hop query, path existence, visited bitmap, bloom filter Meta description: Engineering BFS/DFS templates with iterative implementation, early stop, visited bitmap/bloom selection, and runnable multi-language code. 
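The summary's advice to make search budgets explicit parameters can be sketched as a small BFS wrapper. This is a minimal illustration, not code from the article: the function name, parameters, and the returned truncation flag are all illustrative.

```python
from collections import deque
import time

def bfs_with_budgets(graph, s, max_hops, node_budget, time_budget_s):
    """Illustrative sketch: BFS with explicit hop, node, and time budgets.

    Returns (visited_nodes, truncated) where truncated signals a partial
    result caused by hitting the node or time budget.
    """
    deadline = time.monotonic() + time_budget_s
    vis = {s}
    q = deque([(s, 0)])
    truncated = False
    while q:
        # budget checks at the top of the loop (early stop)
        if len(vis) >= node_budget or time.monotonic() > deadline:
            truncated = True
            break
        u, d = q.popleft()
        if d == max_hops:
            continue  # hop budget: do not expand beyond K
        for v in graph[u]:
            if v not in vis:
                vis.add(v)
                q.append((v, d + 1))
    return vis, truncated
```

On the article's sample graph, `bfs_with_budgets(g, 0, 2, 100, 1.0)` returns the full 2-hop set untruncated, while a tight node budget returns a partial set with the flag raised, which is exactly the "return partial results" behavior the early-stop design template describes.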
Call To Action (CTA) I recommend locking in two actions immediately:\nRefactor one online graph-query API to expose explicit early-stop parameters (hop budget, node budget, time budget) Benchmark HashSet vs bitmap on real data (add bloom prefilter if needed), and record throughput and memory curves If you want, I can write the next post: \u0026ldquo;Union-Find + BFS/DFS selection checklist for graph problems (when to traverse vs when to merge).\u0026rdquo;\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) from collections import deque def bfs_k_hop(graph, s, k): vis = [False] * len(graph) q = deque([(s, 0)]) vis[s] = True out = {s} while q: u, d = q.popleft() if d == k: continue for v in graph[u]: if not vis[v]: vis[v] = True out.add(v) q.append((v, d + 1)) return out def dfs_path_exists(graph, s, t): vis = [False] * len(graph) st = [s] vis[s] = True while st: u = st.pop() if u == t: return True for v in graph[u]: if not vis[v]: vis[v] = True st.append(v) return False if __name__ == \u0026#34;__main__\u0026#34;: g = [[1, 2], [3], [3, 4], [5], [], []] print(sorted(bfs_k_hop(g, 0, 2))) print(dfs_path_exists(g, 0, 5)) #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdbool.h\u0026gt; #define N 6 void bfs_k_hop(int g[N][N], int s, int k) { int q[128][2], head = 0, tail = 0; bool vis[N] = {0}; vis[s] = true; q[tail][0] = s; q[tail][1] = 0; tail++; while (head \u0026lt; tail) { int u = q[head][0], d = q[head][1]; head++; if (d == k) continue; for (int v = 0; v \u0026lt; N; ++v) { if (g[u][v] \u0026amp;\u0026amp; !vis[v]) { vis[v] = true; q[tail][0] = v; q[tail][1] = d + 1; tail++; } } } for (int i = 0; i \u0026lt; N; ++i) if (vis[i]) printf(\u0026#34;%d \u0026#34;, i); printf(\u0026#34;\\n\u0026#34;); } bool dfs_path_exists(int g[N][N], int s, int t) { int st[128], top = 0; bool vis[N] = {0}; st[top++] = s; vis[s] = true; while (top) { int u = st[--top]; if (u == t) return true; for (int v = 0; v \u0026lt; N; ++v) { if (g[u][v] 
\u0026amp;\u0026amp; !vis[v]) { vis[v] = true; st[top++] = v; } } } return false; } int main(void) { int g[N][N] = {0}; g[0][1] = g[0][2] = 1; g[1][3] = 1; g[2][3] = g[2][4] = 1; g[3][5] = 1; bfs_k_hop(g, 0, 2); // 0 1 2 3 4 printf(\u0026#34;%d\\n\u0026#34;, dfs_path_exists(g, 0, 5)); // 1 return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;queue\u0026gt; #include \u0026lt;vector\u0026gt; std::vector\u0026lt;int\u0026gt; bfsKHop(const std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt;\u0026amp; g, int s, int k) { std::vector\u0026lt;char\u0026gt; vis(g.size(), 0); std::queue\u0026lt;std::pair\u0026lt;int, int\u0026gt;\u0026gt; q; vis[s] = 1; q.push({s, 0}); while (!q.empty()) { auto [u, d] = q.front(); q.pop(); if (d == k) continue; for (int v : g[u]) { if (!vis[v]) { vis[v] = 1; q.push({v, d + 1}); } } } std::vector\u0026lt;int\u0026gt; out; for (int i = 0; i \u0026lt; (int)g.size(); ++i) if (vis[i]) out.push_back(i); return out; } bool dfsPathExists(const std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt;\u0026amp; g, int s, int t) { std::vector\u0026lt;char\u0026gt; vis(g.size(), 0); std::vector\u0026lt;int\u0026gt; st = {s}; vis[s] = 1; while (!st.empty()) { int u = st.back(); st.pop_back(); if (u == t) return true; for (int v : g[u]) { if (!vis[v]) { vis[v] = 1; st.push_back(v); } } } return false; } int main() { std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt; g = {{1,2},{3},{3,4},{5},{},{}}; auto nodes = bfsKHop(g, 0, 2); for (int x : nodes) std::cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34; \u0026lt;\u0026lt; dfsPathExists(g, 0, 5) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } package main import \u0026#34;fmt\u0026#34; func bfsKHop(graph [][]int, s, k int) []bool { vis := make([]bool, len(graph)) type Node struct{ u, d int } q := []Node{{s, 0}} vis[s] = true for head := 0; head \u0026lt; len(q); head++ { cur := q[head] if cur.d == k { 
continue } for _, v := range graph[cur.u] { if !vis[v] { vis[v] = true q = append(q, Node{v, cur.d + 1}) } } } return vis } func dfsPathExists(graph [][]int, s, t int) bool { vis := make([]bool, len(graph)) stack := []int{s} vis[s] = true for len(stack) \u0026gt; 0 { u := stack[len(stack)-1] stack = stack[:len(stack)-1] if u == t { return true } for _, v := range graph[u] { if !vis[v] { vis[v] = true stack = append(stack, v) } } } return false } func main() { g := [][]int{{1, 2}, {3}, {3, 4}, {5}, {}, {}} fmt.Println(bfsKHop(g, 0, 2)) fmt.Println(dfsPathExists(g, 0, 5)) } use std::collections::VecDeque; fn bfs_k_hop(graph: \u0026amp;Vec\u0026lt;Vec\u0026lt;usize\u0026gt;\u0026gt;, s: usize, k: usize) -\u0026gt; Vec\u0026lt;bool\u0026gt; { let mut vis = vec![false; graph.len()]; let mut q: VecDeque\u0026lt;(usize, usize)\u0026gt; = VecDeque::new(); vis[s] = true; q.push_back((s, 0)); while let Some((u, d)) = q.pop_front() { if d == k { continue; } for \u0026amp;v in \u0026amp;graph[u] { if !vis[v] { vis[v] = true; q.push_back((v, d + 1)); } } } vis } fn dfs_path_exists(graph: \u0026amp;Vec\u0026lt;Vec\u0026lt;usize\u0026gt;\u0026gt;, s: usize, t: usize) -\u0026gt; bool { let mut vis = vec![false; graph.len()]; let mut st = vec![s]; vis[s] = true; while let Some(u) = st.pop() { if u == t { return true; } for \u0026amp;v in \u0026amp;graph[u] { if !vis[v] { vis[v] = true; st.push(v); } } } false } fn main() { let graph = vec![vec![1, 2], vec![3], vec![3, 4], vec![5], vec![], vec![]]; println!(\u0026#34;{:?}\u0026#34;, bfs_k_hop(\u0026amp;graph, 0, 2)); println!(\u0026#34;{}\u0026#34;, dfs_path_exists(\u0026amp;graph, 0, 5)); } function bfsKHop(graph, s, k) { const vis = Array(graph.length).fill(false); const q = [[s, 0]]; let head = 0; vis[s] = true; while (head \u0026lt; q.length) { const [u, d] = q[head++]; if (d === k) continue; for (const v of graph[u]) { if (!vis[v]) { vis[v] = true; q.push([v, d + 1]); } } } return vis; } function dfsPathExists(graph, s, t) { 
const vis = Array(graph.length).fill(false); const st = [s]; vis[s] = true; while (st.length) { const u = st.pop(); if (u === t) return true; for (const v of graph[u]) { if (!vis[v]) { vis[v] = true; st.push(v); } } } return false; } const g = [[1, 2], [3], [3, 4], [5], [], []]; console.log(bfsKHop(g, 0, 2)); console.log(dfsPathExists(g, 0, 5)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/algorithm/graph/10-bfs-dfs-k-hop-subgraph-path-existence/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\u003cbr\u003e\nBFS / DFS are not just about \u0026ldquo;being able to code them.\u0026rdquo; You need production-ready behavior, predictable cost, and provable correctness. Following the ACERS structure, this article breaks three common tasks (k-hop query, subgraph extraction, and path existence) into reusable templates: \u003cstrong\u003eiterative implementation + early stop + visited structure selection\u003c/strong\u003e.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 12-16 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eGraph\u003c/code\u003e, \u003ccode\u003eBFS\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e, \u003ccode\u003ek-hop\u003c/code\u003e, \u003ccode\u003esubgraph extraction\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: BFS, DFS, k-hop query, subgraph extraction, path existence, visited bitmap, bloom filter\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: BFS/DFS for engineering scenarios: iterative implementations to avoid stack overflow, early stop to cut search cost, and visited bitmap/bloom to optimize memory and dedup performance.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-audience\"\u003eTarget 
Audience\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eEngineers working on graph databases, risk-control relationship graphs, or call-chain analysis\u003c/li\u003e\n\u003cli\u003eLearners who can write \u0026ldquo;problem-solution style BFS/DFS\u0026rdquo; but do not yet have engineering templates\u003c/li\u003e\n\u003cli\u003eDevelopers who want traversal code that is stable, observable, and extensible\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eIn production systems, BFS/DFS is usually not a one-off offline script. It is part of an online request path:\u003c/p\u003e","title":"BFS / DFS Engineering Primer: k-hop Queries, Subgraph Extraction, and Path Existence ACERS Breakdown"},{"content":" Subtitle / Summary\nThe key is not comparing values, but comparing node identity (same object / same address). This ACERS guide explains the naive hash approach, the length-alignment approach, and the most practical switch-head two-pointer template, with runnable multi-language implementations under the no-modification and no-cycle constraints.\nReading time: 10-14 min Tags: Hot100, linked list, two pointers SEO keywords: Intersection of Two Linked Lists, switch heads, O(1) space, LeetCode 160 Meta description: Two pointers walk A then B and B then A, guaranteeing meeting at the intersection or both reaching null within m+n steps, with O(m+n) time and O(1) space. 
Target Readers Hot100 learners who want a reusable linked-list two-pointer template Developers who often confuse \u0026ldquo;same value\u0026rdquo; with \u0026ldquo;same node\u0026rdquo; Engineers working with shared tail structures in chain-like data Background / Motivation This problem looks simple, but it forces you to separate three concepts:\nIntersection means sharing the exact same node object, not equal values You cannot modify structure (no rewriting next, no marking nodes) You still need linear performance The most practical solution is the switch-head two-pointer method. It needs no hash set and no precomputed lengths, yet synchronizes both pointers in at most m+n steps.\nCore Concepts Concept Meaning Note Same node Two pointers reference the same memory object Pointer/reference equality Shared suffix Two lists share all nodes from some node onward After intersection, tails are identical Switch-head two pointers At list end, jump to the other list head Equalizes total traveled distance No-cycle assumption Problem guarantees no cycle in the structure Otherwise cycle handling is required A - Algorithm (Problem and Algorithm) Problem Restatement Given heads headA and headB of two singly linked lists, return the node where they intersect. 
If they do not intersect, return null.\nConstraints:\nThe linked structure has no cycle The original list structure must remain unchanged Input / Output Name Type Description headA ListNode Head of list A headB ListNode Head of list B return ListNode / null Intersection start node (same object), or null Example 1 (Intersecting) A: a1 -\u0026gt; a2 -\u0026gt; c1 -\u0026gt; c2 -\u0026gt; c3 B: b1 -\u0026gt; b2 -\u0026gt; b3 -\u0026gt; c1 -\u0026gt; c2 -\u0026gt; c3 output: c1 (return node reference, not value) Example 2 (No intersection) A: 1 -\u0026gt; 2 -\u0026gt; 3 B: 4 -\u0026gt; 5 output: null Thought Process: From Hash to O(1) Template Naive approach: hash all nodes in A Traverse A and put each node address into a hash set Traverse B and return the first node found in the set Pros: direct and easy to implement. Cons: O(m) extra space.\nO(1) approach #1: length alignment Compute lengths m and n Advance the longer list by abs(m-n) Move both pointers together until they meet This is O(1) space, but requires separate length passes.\nO(1) approach #2 (most practical): switch-head two pointers Initialize pA=headA, pB=headB:\nMove one step each round If a pointer reaches null, redirect it to the other list head Intuition: pA walks path A + B, pB walks path B + A. Both paths have equal total length, so they synchronize at the intersection, or both become null.\nC - Concepts (Core Ideas) Method category Two pointers on linked list Implicit length alignment through \u0026ldquo;walk full A then B\u0026rdquo; Identity equality without structural mutation Why switch-head pointers must meet Let:\na: unique prefix length of A b: unique prefix length of B c: shared suffix length Then:\nLength of A is a + c Length of B is b + c Pointer travel:\npA walks a+c, then b pB walks b+c, then a Both walk a+b+c before entering the same alignment point. 
So within m+n steps, they either meet at the intersection or both reach null.\nPractice Guide / Steps Initialize pA=headA, pB=headB Loop while pA != pB: pA = pA.next if pA else headB pB = pB.next if pB else headA Return pA (intersection node or null) Runnable Python example (intersection.py):\nfrom __future__ import annotations class ListNode: def __init__(self, val: int): self.val = val self.next: ListNode | None = None def get_intersection_node(head_a: ListNode | None, head_b: ListNode | None) -\u0026gt; ListNode | None: p, q = head_a, head_b while p is not q: p = p.next if p else head_b q = q.next if q else head_a return p if __name__ == \u0026#34;__main__\u0026#34;: # Build shared tail: c1 -\u0026gt; c2 -\u0026gt; c3 c1 = ListNode(8) c2 = ListNode(4) c3 = ListNode(5) c1.next = c2 c2.next = c3 # A: a1 -\u0026gt; a2 -\u0026gt; c1 a1 = ListNode(4) a2 = ListNode(1) a1.next = a2 a2.next = c1 # B: b1 -\u0026gt; b2 -\u0026gt; b3 -\u0026gt; c1 b1 = ListNode(5) b2 = ListNode(6) b3 = ListNode(1) b1.next = b2 b2.next = b3 b3.next = c1 ans = get_intersection_node(a1, b1) print(ans.val if ans else None) # 8 E - Engineering (Real-world Scenarios) Scenario 1: Shared suffix dedup in versioned pipelines (Python) Background: Some experiment/task pipelines are chain nodes, and multiple pipelines can share a common tail. 
Why it fits: locating the intersection lets you execute shared tail once or cache it.\nclass Step: def __init__(self, name): self.name = name self.next = None def intersection(a, b): p, q = a, b while p is not q: p = p.next if p else b q = q.next if q else a return p if __name__ == \u0026#34;__main__\u0026#34;: common = Step(\u0026#34;train\u0026#34;) common.next = Step(\u0026#34;evaluate\u0026#34;) a = Step(\u0026#34;clean\u0026#34;) a.next = Step(\u0026#34;fe\u0026#34;) a.next.next = common b = Step(\u0026#34;clean_v2\u0026#34;) b.next = common hit = intersection(a, b) print(hit.name if hit else \u0026#34;none\u0026#34;) # train Scenario 2: Safety check to avoid double free (C) Background: In C projects, accidentally shared list tails can cause double free if both lists are freed independently. Why it fits: detect intersection first, then free shared suffix only once.\nstruct Node { int v; struct Node* next; }; struct Node* intersection(struct Node* a, struct Node* b) { struct Node* p = a; struct Node* q = b; while (p != q) { p = p ? p-\u0026gt;next : b; q = q ? q-\u0026gt;next : a; } return p; // may be NULL } Scenario 3: Merge point detection in frontend history branches (JavaScript) Background: Some editors represent operation history as linked nodes; branches can share a common tail after merge/replay. Why it fits: finding intersection gives \u0026ldquo;where shared history starts\u0026rdquo; for UI highlight and merge strategy.\nfunction intersection(headA, headB) { let p = headA; let q = headB; while (p !== q) { p = p ? p.next : headB; q = q ? 
q.next : headA; } return p; } R - Reflection (Tradeoffs and Deepening) Complexity Time: O(m+n) Space: O(1) Alternatives Method Idea Time Extra space Note Hash set Store nodes of A, scan B O(m+n) O(m) Most direct Length alignment Align by length difference O(m+n) O(1) Needs separate length pass Switch-head two pointers A-\u0026gt;B and B-\u0026gt;A traversal O(m+n) O(1) Cleanest template Common pitfalls Using value equality as intersection: intersection requires node identity equality. Null handling mistakes: loop should end by pointer identity; result can be null. Using template on cyclic lists without checks: this problem guarantees acyclic lists; otherwise loop risk exists. FAQs and Notes Why no infinite loop? Under no-cycle assumption, each pointer walks at most m+n steps before meeting at intersection or both becoming null.\nWhat if headA == headB? They are already equal at start; return immediately.\nCan I mark nodes or modify values? No. Problem requires preserving original structure, and structural mutation is unsafe in shared data.\nBest Practices Memorize the switch-head pattern: p = p ? p-\u0026gt;next : headB, q = q ? q-\u0026gt;next : headA For any \u0026ldquo;shared tail\u0026rdquo; question, first verify whether equality means identity or value If cycles are possible in production data, run cycle detection first S - Summary Key Takeaways Intersection in this problem means same node object, not same value Hash is easy but costs memory; length alignment is O(1) but needs explicit length pass Switch-head two pointers implicitly aligns path lengths, giving O(m+n) time and O(1) space without mutations The no-cycle guarantee is a core precondition for termination and correctness This template transfers to shared-tail / merge-point / common-suffix structures References and Further Reading LeetCode 160. 
Intersection of Two Linked Lists Classic linked-list pointer patterns: cycle detection, middle node, remove Nth from end Pointer identity concepts in shared mutable structures Meta Reading time: 10-14 min Tags: Hot100, linked list, two pointers, space optimization SEO keywords: Intersection of Two Linked Lists, switch heads, O(1) space, LeetCode 160 Meta description: Two pointers walk A then B and B then A, meeting at intersection or null within m+n steps, with O(m+n) time and O(1) space. Call to Action Use the same thinking on two follow-up problems:\nLinked List Cycle (Floyd) Remove Nth Node From End (fast/slow pointers) If you want, I can also add an advanced follow-up post: how to reason about intersection when cycles may exist.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) from __future__ import annotations class ListNode: def __init__(self, x: int): self.val = x self.next: ListNode | None = None def get_intersection_node(head_a: ListNode | None, head_b: ListNode | None) -\u0026gt; ListNode | None: p, q = head_a, head_b while p is not q: p = p.next if p else head_b q = q.next if q else head_a return p #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; struct ListNode { int val; struct ListNode* next; }; struct ListNode* getIntersectionNode(struct ListNode* headA, struct ListNode* headB) { struct ListNode* p = headA; struct ListNode* q = headB; while (p != q) { p = p ? p-\u0026gt;next : headB; q = q ? 
q-\u0026gt;next : headA; } return p; } static struct ListNode* node(int v) { struct ListNode* n = (struct ListNode*)malloc(sizeof(struct ListNode)); n-\u0026gt;val = v; n-\u0026gt;next = NULL; return n; } int main(void) { // shared: c1(8) -\u0026gt; c2(4) -\u0026gt; c3(5) struct ListNode* c1 = node(8); struct ListNode* c2 = node(4); struct ListNode* c3 = node(5); c1-\u0026gt;next = c2; c2-\u0026gt;next = c3; // A: 4 -\u0026gt; 1 -\u0026gt; c1 struct ListNode* a1 = node(4); struct ListNode* a2 = node(1); a1-\u0026gt;next = a2; a2-\u0026gt;next = c1; // B: 5 -\u0026gt; 6 -\u0026gt; 1 -\u0026gt; c1 struct ListNode* b1 = node(5); struct ListNode* b2 = node(6); struct ListNode* b3 = node(1); b1-\u0026gt;next = b2; b2-\u0026gt;next = b3; b3-\u0026gt;next = c1; struct ListNode* ans = getIntersectionNode(a1, b1); if (ans) printf(\u0026#34;%d\\n\u0026#34;, ans-\u0026gt;val); else printf(\u0026#34;null\\n\u0026#34;); // In real code, free nodes carefully: shared suffix should be freed once. return 0; } #include \u0026lt;iostream\u0026gt; struct ListNode { int val; ListNode* next; explicit ListNode(int x) : val(x), next(nullptr) {} }; ListNode* getIntersectionNode(ListNode* headA, ListNode* headB) { ListNode* p = headA; ListNode* q = headB; while (p != q) { p = p ? p-\u0026gt;next : headB; q = q ? q-\u0026gt;next : headA; } return p; } int main() { // shared: c1 -\u0026gt; c2 -\u0026gt; c3 auto* c1 = new ListNode(8); auto* c2 = new ListNode(4); auto* c3 = new ListNode(5); c1-\u0026gt;next = c2; c2-\u0026gt;next = c3; // A: 4 -\u0026gt; 1 -\u0026gt; c1 auto* a1 = new ListNode(4); auto* a2 = new ListNode(1); a1-\u0026gt;next = a2; a2-\u0026gt;next = c1; // B: 5 -\u0026gt; 6 -\u0026gt; 1 -\u0026gt; c1 auto* b1 = new ListNode(5); auto* b2 = new ListNode(6); auto* b3 = new ListNode(1); b1-\u0026gt;next = b2; b2-\u0026gt;next = b3; b3-\u0026gt;next = c1; ListNode* ans = getIntersectionNode(a1, b1); std::cout \u0026lt;\u0026lt; (ans ? 
std::to_string(ans-\u0026gt;val) : std::string(\u0026#34;null\u0026#34;)) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; // Demo only: free omitted. return 0; } package main import \u0026#34;fmt\u0026#34; type ListNode struct { Val int Next *ListNode } func getIntersectionNode(headA, headB *ListNode) *ListNode { p, q := headA, headB for p != q { if p == nil { p = headB } else { p = p.Next } if q == nil { q = headA } else { q = q.Next } } return p } func main() { // shared: c1(8) -\u0026gt; c2(4) -\u0026gt; c3(5) c3 := \u0026amp;ListNode{Val: 5} c2 := \u0026amp;ListNode{Val: 4, Next: c3} c1 := \u0026amp;ListNode{Val: 8, Next: c2} // A: 4 -\u0026gt; 1 -\u0026gt; c1 a := \u0026amp;ListNode{Val: 4, Next: \u0026amp;ListNode{Val: 1, Next: c1}} // B: 5 -\u0026gt; 6 -\u0026gt; 1 -\u0026gt; c1 b := \u0026amp;ListNode{Val: 5, Next: \u0026amp;ListNode{Val: 6, Next: \u0026amp;ListNode{Val: 1, Next: c1}}} ans := getIntersectionNode(a, b) if ans != nil { fmt.Println(ans.Val) } else { fmt.Println(\u0026#34;null\u0026#34;) } } use std::cell::RefCell; use std::rc::Rc; #[derive(Debug)] struct ListNode { val: i32, next: Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;ListNode\u0026gt;\u0026gt;\u0026gt;, } fn node(val: i32) -\u0026gt; Rc\u0026lt;RefCell\u0026lt;ListNode\u0026gt;\u0026gt; { Rc::new(RefCell::new(ListNode { val, next: None })) } fn same(a: \u0026amp;Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;ListNode\u0026gt;\u0026gt;\u0026gt;, b: \u0026amp;Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;ListNode\u0026gt;\u0026gt;\u0026gt;) -\u0026gt; bool { match (a, b) { (Some(x), Some(y)) =\u0026gt; Rc::ptr_eq(x, y), (None, None) =\u0026gt; true, _ =\u0026gt; false, } } fn get_intersection_node( head_a: Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;ListNode\u0026gt;\u0026gt;\u0026gt;, head_b: Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;ListNode\u0026gt;\u0026gt;\u0026gt;, ) -\u0026gt; Option\u0026lt;Rc\u0026lt;RefCell\u0026lt;ListNode\u0026gt;\u0026gt;\u0026gt; { let mut p = head_a.clone(); let mut q = 
head_b.clone(); while !same(\u0026amp;p, \u0026amp;q) { p = if let Some(n) = p { n.borrow().next.clone() } else { head_b.clone() }; q = if let Some(n) = q { n.borrow().next.clone() } else { head_a.clone() }; } p } fn main() { // shared: c1(8) -\u0026gt; c2(4) -\u0026gt; c3(5) let c1 = node(8); let c2 = node(4); let c3 = node(5); c1.borrow_mut().next = Some(c2.clone()); c2.borrow_mut().next = Some(c3.clone()); // A: 4 -\u0026gt; 1 -\u0026gt; c1 let a1 = node(4); let a2 = node(1); a1.borrow_mut().next = Some(a2.clone()); a2.borrow_mut().next = Some(c1.clone()); // B: 5 -\u0026gt; 6 -\u0026gt; 1 -\u0026gt; c1 let b1 = node(5); let b2 = node(6); let b3 = node(1); b1.borrow_mut().next = Some(b2.clone()); b2.borrow_mut().next = Some(b3.clone()); b3.borrow_mut().next = Some(c1.clone()); let ans = get_intersection_node(Some(a1), Some(b1)); match ans { Some(n) =\u0026gt; println!(\u0026#34;{}\u0026#34;, n.borrow().val), None =\u0026gt; println!(\u0026#34;null\u0026#34;), } } class ListNode { constructor(val) { this.val = val; this.next = null; } } function getIntersectionNode(headA, headB) { let p = headA; let q = headB; while (p !== q) { p = p ? p.next : headB; q = q ? q.next : headA; } return p; } // demo const c1 = new ListNode(8); const c2 = new ListNode(4); const c3 = new ListNode(5); c1.next = c2; c2.next = c3; const a1 = new ListNode(4); const a2 = new ListNode(1); a1.next = a2; a2.next = c1; const b1 = new ListNode(5); const b2 = new ListNode(6); const b3 = new ListNode(1); b1.next = b2; b2.next = b3; b3.next = c1; const ans = getIntersectionNode(a1, b1); console.log(ans ? ans.val : null); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/160-intersection-of-two-linked-lists/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThe key is not comparing values, but comparing node identity (same object / same address). 
This ACERS guide explains the naive hash approach, the length-alignment approach, and the most practical switch-head two-pointer template, with runnable multi-language implementations under the no-modification and no-cycle constraints.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10-14 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003elinked list\u003c/code\u003e, \u003ccode\u003etwo pointers\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Intersection of Two Linked Lists, switch heads, O(1) space, LeetCode 160\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Two pointers walk A then B and B then A, guaranteeing meeting at the intersection or both reaching null within m+n steps, with O(m+n) time and O(1) space.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want a reusable linked-list two-pointer template\u003c/li\u003e\n\u003cli\u003eDevelopers who often confuse \u0026ldquo;same value\u0026rdquo; with \u0026ldquo;same node\u0026rdquo;\u003c/li\u003e\n\u003cli\u003eEngineers working with shared tail structures in chain-like data\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eThis problem looks simple, but it forces you to separate three concepts:\u003c/p\u003e","title":"Hot100: Intersection of Two Linked Lists Two-Pointer Switch-Head O(1) Space ACERS Guide"},{"content":" Subtitle / Summary\nThe constraint “the path can start anywhere, but must go downward” makes root-to-leaf DP insufficient. 
This ACERS guide explains prefix sums on trees: convert any downward path into a difference of two prefix sums, maintain a frequency hash map during one DFS, and finish in O(n).\nReading time: 12–15 min Tags: binary tree, prefix sum, DFS, hash map SEO keywords: Path Sum III, tree prefix sum, prefix-sum hash, LeetCode 437 Meta description: Count downward paths whose sum equals targetSum in O(n) via prefix sum + hash map, with derivation, tradeoffs, and multi-language implementations. Target Readers LeetCode learners who want a reusable “tree + hash map” template People who tend to write O(n²) when the path does not have to start at the root Engineers working with hierarchical data (call traces, org trees) who need “downward segment” statistics Background / Motivation Many “tree path” problems hide a trap: you naturally assume paths start at the root, or end at leaves — but this problem allows the path to start and end at any nodes, as long as the direction is downward (parent → child).\nThat means:\nMaintaining only a “root-to-current” DP state is not enough Enumerating all start nodes degrades to O(n²) in a skewed tree Sliding window does not apply (node values can be negative, so there is no monotonicity) The key skill worth internalizing is:\nTurn “any downward path” into “the difference of two prefix sums on the same DFS path”.\nOnce you own this model, a lot of tree counting problems collapse into the familiar recipe: prefix sum + frequency map.\nCore Concepts Downward path: can only go from parent to child (no backtracking, no cross-branch jumps) Prefix sum: the sum along the path from the root to the current node Difference counting: if curSum - prevSum = target, then prevSum = curSum - target Path-local hash map: the map must represent prefix sums on the current DFS stack; you must undo it on backtracking A — Algorithm (Problem \u0026amp; Algorithm) Problem Restatement Given the root root of a binary tree and an integer targetSum, return the number of 
downward paths whose node values sum to targetSum. The path does not need to start at the root or end at a leaf, but it must go downward (parent → child).\nInput / Output Name Type Description root TreeNode root of the binary tree targetSum int target path sum return int number of valid downward paths Example 1 10 / \\ 5 -3 / \\ \\ 3 2 11 / \\ \\ 3 -2 1 targetSum = 8 output: 3 explain: 5-\u0026gt;3, 5-\u0026gt;2-\u0026gt;1, -3-\u0026gt;11 Example 2 1 / \\ 2 3 targetSum = 3 output: 2 explain: 1-\u0026gt;2, 3 C — Concepts (Core Ideas) Derivation: from O(n²) enumeration to O(n) prefix sum Naive approach: start DFS from every node\nFor each node start, count downward paths starting at start with sum targetSum. This can degrade to O(n²) on a chain-like tree and repeats work.\nKey observation: any downward path is a contiguous segment of a root-to-current path\nDuring a DFS, we are always standing on one root → current path (the recursion stack). If we define:\ncurSum: prefix sum from root to current node prevSum: prefix sum from root to some ancestor node Then the sum of the downward path “(ancestor’s child) → current” equals:\ncurSum - prevSum To make it equal targetSum, we need:\nprevSum = curSum - targetSum Method choice: frequency map of prefix sums on the current DFS path\nWhen we visit a node: compute curSum add cnt[curSum - targetSum] to the answer (all paths ending at this node) increment cnt[curSum] and recurse to children decrement cnt[curSum] on backtracking (do not leak into sibling branches) Method Category Prefix sum on tree DFS + frequency hash map Backtracking to maintain path-local state Key Invariant (the thing that makes it correct) When processing node x, the map cnt contains prefix sum counts for the path from the root to x’s parent only. 
That’s why cnt[curSum - targetSum] exactly means “how many ancestors produce a valid downward segment ending at x”.\nInitialization cnt[0] = 1 matters: we treat the “empty prefix” as occurring once, so when curSum == targetSum (a path starting at the root), it is counted.\nPractice Guide / Steps Run DFS with arguments: node, current prefix sum curSum On entry: update curSum += node.val Add cnt[curSum - targetSum] to the answer Record cnt[curSum] += 1 Recurse to left/right and accumulate counts Backtrack: cnt[curSum] -= 1 Return the accumulated answer Runnable Python example (save as path_sum_iii.py):\nfrom typing import Dict, Optional class TreeNode: def __init__(self, val: int = 0, left: Optional[\u0026#34;TreeNode\u0026#34;] = None, right: Optional[\u0026#34;TreeNode\u0026#34;] = None): self.val = val self.left = left self.right = right def path_sum(root: Optional[TreeNode], target_sum: int) -\u0026gt; int: cnt: Dict[int, int] = {0: 1} def dfs(node: Optional[TreeNode], cur: int) -\u0026gt; int: if node is None: return 0 cur += node.val ans = cnt.get(cur - target_sum, 0) cnt[cur] = cnt.get(cur, 0) + 1 ans += dfs(node.left, cur) ans += dfs(node.right, cur) cnt[cur] -= 1 return ans return dfs(root, 0) if __name__ == \u0026#34;__main__\u0026#34;: # Example 2 root = TreeNode(1, TreeNode(2), TreeNode(3)) print(path_sum(root, 3)) # 2 E — Engineering (Real-world Scenarios) The transferable value of this problem is: counting “downward contiguous segments” in hierarchical data.\nIf your data can be modeled as a parent→child tree and each node has an additive value, you can apply the same template.\nScenario 1: Trace tree — count “downward segments with exact total cost” (Go) Background: a request trace forms a tree of spans; each span has a cost (latency) or a score.\nWhy it fits: you may want to count how many downward sub-chains sum to a threshold (feature construction, pattern detection).\npackage main import \u0026#34;fmt\u0026#34; type Span struct { Cost int64 Next 
[]*Span } func countPaths(root *Span, target int64) int64 { cnt := map[int64]int64{0: 1} var dfs func(*Span, int64) int64 dfs = func(node *Span, cur int64) int64 { if node == nil { return 0 } cur += node.Cost ans := cnt[cur-target] cnt[cur]++ for _, ch := range node.Next { ans += dfs(ch, cur) } cnt[cur]-- return ans } return dfs(root, 0) } func main() { root := \u0026amp;Span{Cost: 1, Next: []*Span{{Cost: 2}, {Cost: 3}}} fmt.Println(countPaths(root, 3)) // 2: 1-\u0026gt;2, 3 } Scenario 2: Org tree / directory tree — count “budget segments” (Python) Background: an org tree where each node carries a budget delta or cost.\nWhy it fits: count downward segments whose total equals a target (compliance rules, feature engineering).\nfrom collections import defaultdict class Node: def __init__(self, v, children=None): self.v = v self.children = children or [] def count_paths(root, target): cnt = defaultdict(int) cnt[0] = 1 def dfs(node, cur): if node is None: return 0 cur += node.v ans = cnt[cur - target] cnt[cur] += 1 for ch in node.children: ans += dfs(ch, cur) cnt[cur] -= 1 return ans return dfs(root, 0) if __name__ == \u0026#34;__main__\u0026#34;: root = Node(1, [Node(2), Node(3)]) print(count_paths(root, 3)) Scenario 3: Frontend component tree — count “downward weight segments” (JavaScript) Background: component/menu trees where each node has a weight (exposure score, risk score, cost score).\nWhy it fits: count how many downward segments match a target sum for debugging or rule matching.\nfunction Node(v, children = []) { this.v = v; this.children = children; } function countPaths(root, target) { const cnt = new Map(); cnt.set(0, 1); function dfs(node, cur) { if (!node) return 0; cur += node.v; const need = cur - target; let ans = cnt.get(need) || 0; cnt.set(cur, (cnt.get(cur) || 0) + 1); for (const ch of node.children) ans += dfs(ch, cur); cnt.set(cur, cnt.get(cur) - 1); return ans; } return dfs(root, 0); } const root = new Node(1, [new Node(2), new Node(3)]); 
console.log(countPaths(root, 3)); R — Reflection (Tradeoffs \u0026amp; Deeper Notes) Complexity Time: O(n)\nEach node is visited once, and hash operations are O(1) amortized. Space: O(h) ~ O(n)\nThe map stores prefix sums along the current root-to-node path; in the worst case, a skewed tree has height h = n. Alternatives Comparison Method Idea Complexity Issue DFS from every node enumerate start nodes worst O(n²) TLE on skewed trees root-to-leaf DP only classic Path Sum DP O(n) violates “start anywhere / end anywhere” prefix sum + hash map difference counting O(n) must backtrack correctly Common Pitfalls (high-frequency mistakes) Forgetting cnt[curSum] -= 1 on backtracking: you mix prefix sums from sibling branches and count invalid cross-branch paths. Trying sliding window: node values can be negative, so the window has no monotonic property. Counting only paths starting at the root: you will miss paths starting in the middle. Overflow in prefix sums: in production, prefer int64/long long for prefix sums. Explanation / Why it works This is essentially the “tree version of LeetCode 560 (Subarray Sum Equals K)”:\nIn arrays: subarray sum = difference of two prefix sums In trees: downward path sum = difference of two prefix sums on the same DFS stack path The only extra requirement is handling branching: the hash map must represent only the current path, so we must do “enter +1, exit -1” to keep the counting scope correct.\nFAQs and Notes Why cnt[0] = 1?\nIt counts paths that start at the root: if curSum == targetSum, then curSum - targetSum == 0, which matches the “empty prefix”.\nCan I write it as iterative DFS?\nYes, but you must model backtracking explicitly (push enter/exit events). Recursion is simpler; if stack depth is a concern, use an explicit stack.\nDoes the path need to end at a leaf?\nNo. 
We count paths ending at any node, so we add cnt[cur-target] at every node.\nBest Practices Use int64/long long for prefix sums to avoid overflow Keep the meaning of cnt crystal clear: it is path-local (current DFS stack only) Keep a fixed update order: count first → cnt[cur]++ → recurse → cnt[cur]-- Hand-simulate 2–3 steps on a tiny tree to verify backtracking restores the state S — Summary Key Takeaways For “start anywhere, end anywhere, but downward only” tree path counting, think “prefix sum on tree” first Any downward path sum can be written as a difference of two prefix sums on the same DFS path A frequency map of prefix sums lets you count all paths ending at the current node online in O(1) amortized Backtracking the frequency map is the correctness linchpin (prevents cross-branch pollution) Conclusion The key in LeetCode 437 is not “DFS itself”, but modeling it as difference counting on tree prefix sums. Once you see it this way, the solution becomes short, fast, and reusable.\nReferences and Further Reading LeetCode 437. Path Sum III LeetCode 560. Subarray Sum Equals K (same idea on arrays) LeetCode 112/113. Path Sum / Path Sum II (different constraints; good for comparison) A standard DFS backtracking pattern: mutate state on entry, undo on exit Meta Reading time: 12–15 min Tags: binary tree, prefix sum, DFS, LeetCode 437 SEO keywords: Path Sum III, tree prefix sum, prefix-sum hash, LeetCode 437 Meta description: Prefix sum + hash map to count downward paths with sum targetSum, with derivation and multi-language implementations. 
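The “tree version of LeetCode 560” analogy is easy to verify on a flat array before moving on. Below is a minimal sketch of the 560-style difference counting (the helper name count_subarrays is mine, not from either problem); note that the update order — read the map first, then record the current prefix — mirrors the tree DFS exactly.

```python
from collections import defaultdict

def count_subarrays(nums, target):
    """Count contiguous subarrays summing to target via prefix-sum differences."""
    cnt = defaultdict(int)
    cnt[0] = 1          # empty prefix: same role as cnt[0] = 1 on the tree
    cur = 0
    ans = 0
    for x in nums:
        cur += x
        ans += cnt[cur - target]   # count first (subarrays ending here)...
        cnt[cur] += 1              # ...then record, same order as the tree DFS
    return ans

print(count_subarrays([1, 2, 3], 3))  # 2: [1, 2] and [3]
```

On the tree, the only extra step is the `cnt[cur] -= 1` on backtracking; on an array there is a single path, so no undo is needed.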
Call to Action If you want to solidify this template, do these two next:\nLeetCode 560 (array prefix-sum difference counting) LeetCode 112/113 (Path Sum variants to see how constraints change the solution) If you want an ACERS-style write-up of 560 as a “prefix sum template”, tell me.\nMulti-language Reference Implementations (Python / C / C++ / Go / Rust / JS) from typing import Optional, Dict class TreeNode: def __init__(self, val: int = 0, left: Optional[\u0026#34;TreeNode\u0026#34;] = None, right: Optional[\u0026#34;TreeNode\u0026#34;] = None): self.val = val self.left = left self.right = right def pathSum(root: Optional[TreeNode], targetSum: int) -\u0026gt; int: cnt: Dict[int, int] = {0: 1} def dfs(node: Optional[TreeNode], cur: int) -\u0026gt; int: if node is None: return 0 cur += node.val ans = cnt.get(cur - targetSum, 0) cnt[cur] = cnt.get(cur, 0) + 1 ans += dfs(node.left, cur) ans += dfs(node.right, cur) cnt[cur] -= 1 return ans return dfs(root, 0) if __name__ == \u0026#34;__main__\u0026#34;: # Example 1 root = TreeNode( 10, TreeNode( 5, TreeNode(3, TreeNode(3), TreeNode(-2)), TreeNode(2, None, TreeNode(1)), ), TreeNode(-3, None, TreeNode(11)), ) print(pathSum(root, 8)) # 3 #include \u0026lt;stdint.h\u0026gt; #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; typedef long long i64; typedef struct TreeNode { int val; struct TreeNode* left; struct TreeNode* right; } TreeNode; static int count_nodes(const TreeNode* root) { if (!root) return 0; return 1 + count_nodes(root-\u0026gt;left) + count_nodes(root-\u0026gt;right); } static uint64_t mix64(uint64_t x) { x += 0x9e3779b97f4a7c15ULL; x = (x ^ (x \u0026gt;\u0026gt; 30)) * 0xbf58476d1ce4e5b9ULL; x = (x ^ (x \u0026gt;\u0026gt; 27)) * 0x94d049bb133111ebULL; return x ^ (x \u0026gt;\u0026gt; 31); } typedef struct { i64* keys; int* vals; unsigned char* used; size_t cap; } Map; static Map map_new(size_t cap) { Map m; m.cap = cap; m.keys = (i64*)calloc(cap, sizeof(i64)); m.vals = 
(int*)calloc(cap, sizeof(int)); m.used = (unsigned char*)calloc(cap, sizeof(unsigned char)); return m; } static void map_free(Map* m) { free(m-\u0026gt;keys); free(m-\u0026gt;vals); free(m-\u0026gt;used); } static int map_get(const Map* m, i64 key) { size_t mask = m-\u0026gt;cap - 1; size_t i = (size_t)mix64((uint64_t)key) \u0026amp; mask; while (m-\u0026gt;used[i]) { if (m-\u0026gt;keys[i] == key) return m-\u0026gt;vals[i]; i = (i + 1) \u0026amp; mask; } return 0; } static void map_add(Map* m, i64 key, int delta) { size_t mask = m-\u0026gt;cap - 1; size_t i = (size_t)mix64((uint64_t)key) \u0026amp; mask; while (m-\u0026gt;used[i]) { if (m-\u0026gt;keys[i] == key) { m-\u0026gt;vals[i] += delta; return; } i = (i + 1) \u0026amp; mask; } m-\u0026gt;used[i] = 1; m-\u0026gt;keys[i] = key; m-\u0026gt;vals[i] = delta; } static int dfs(TreeNode* node, i64 cur, i64 target, Map* cnt) { if (!node) return 0; cur += (i64)node-\u0026gt;val; int ans = map_get(cnt, cur - target); map_add(cnt, cur, 1); ans += dfs(node-\u0026gt;left, cur, target, cnt); ans += dfs(node-\u0026gt;right, cur, target, cnt); map_add(cnt, cur, -1); return ans; } static int pathSum(TreeNode* root, int targetSum) { int n = count_nodes(root); size_t cap = 1; while (cap \u0026lt; (size_t)(n * 4 + 8)) cap \u0026lt;\u0026lt;= 1; /* keep load factor low */ Map cnt = map_new(cap); map_add(\u0026amp;cnt, 0, 1); int ans = dfs(root, 0, (i64)targetSum, \u0026amp;cnt); map_free(\u0026amp;cnt); return ans; } static TreeNode* node(int v, TreeNode* l, TreeNode* r) { TreeNode* n = (TreeNode*)malloc(sizeof(TreeNode)); n-\u0026gt;val = v; n-\u0026gt;left = l; n-\u0026gt;right = r; return n; } static void free_tree(TreeNode* root) { if (!root) return; free_tree(root-\u0026gt;left); free_tree(root-\u0026gt;right); free(root); } int main(void) { /* Example 1 */ TreeNode* root = node(10, node(5, node(3, node(3, NULL, NULL), node(-2, NULL, NULL)), node(2, NULL, node(1, NULL, NULL))), node(-3, NULL, node(11, NULL, NULL))); 
printf(\u0026#34;%d\\n\u0026#34;, pathSum(root, 8)); /* 3 */ free_tree(root); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;unordered_map\u0026gt; struct TreeNode { int val; TreeNode* left; TreeNode* right; explicit TreeNode(int v) : val(v), left(nullptr), right(nullptr) {} }; static int dfs(TreeNode* node, long long cur, long long target, std::unordered_map\u0026lt;long long, int\u0026gt;\u0026amp; cnt) { if (!node) return 0; cur += node-\u0026gt;val; int ans = 0; auto it = cnt.find(cur - target); if (it != cnt.end()) ans += it-\u0026gt;second; cnt[cur] += 1; ans += dfs(node-\u0026gt;left, cur, target, cnt); ans += dfs(node-\u0026gt;right, cur, target, cnt); cnt[cur] -= 1; return ans; } int pathSum(TreeNode* root, int targetSum) { std::unordered_map\u0026lt;long long, int\u0026gt; cnt; cnt[0] = 1; return dfs(root, 0, targetSum, cnt); } int main() { // Example 1 auto* root = new TreeNode(10); root-\u0026gt;left = new TreeNode(5); root-\u0026gt;right = new TreeNode(-3); root-\u0026gt;left-\u0026gt;left = new TreeNode(3); root-\u0026gt;left-\u0026gt;right = new TreeNode(2); root-\u0026gt;right-\u0026gt;right = new TreeNode(11); root-\u0026gt;left-\u0026gt;left-\u0026gt;left = new TreeNode(3); root-\u0026gt;left-\u0026gt;left-\u0026gt;right = new TreeNode(-2); root-\u0026gt;left-\u0026gt;right-\u0026gt;right = new TreeNode(1); std::cout \u0026lt;\u0026lt; pathSum(root, 8) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; // 3 // Omit delete in this demo. In production, release memory properly. 
return 0; } package main import \u0026#34;fmt\u0026#34; type TreeNode struct { Val int64 Left *TreeNode Right *TreeNode } func pathSum(root *TreeNode, targetSum int64) int64 { cnt := map[int64]int64{0: 1} var dfs func(*TreeNode, int64) int64 dfs = func(node *TreeNode, cur int64) int64 { if node == nil { return 0 } cur += node.Val ans := cnt[cur-targetSum] cnt[cur]++ ans += dfs(node.Left, cur) ans += dfs(node.Right, cur) cnt[cur]-- return ans } return dfs(root, 0) } func main() { // Example 1 root := \u0026amp;TreeNode{Val: 10} root.Left = \u0026amp;TreeNode{Val: 5} root.Right = \u0026amp;TreeNode{Val: -3} root.Left.Left = \u0026amp;TreeNode{Val: 3} root.Left.Right = \u0026amp;TreeNode{Val: 2} root.Right.Right = \u0026amp;TreeNode{Val: 11} root.Left.Left.Left = \u0026amp;TreeNode{Val: 3} root.Left.Left.Right = \u0026amp;TreeNode{Val: -2} root.Left.Right.Right = \u0026amp;TreeNode{Val: 1} fmt.Println(pathSum(root, 8)) // 3 } use std::collections::HashMap; #[derive(Debug)] struct TreeNode { val: i64, left: Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, right: Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, } impl TreeNode { fn new(val: i64) -\u0026gt; Self { TreeNode { val, left: None, right: None } } } fn dfs(node: \u0026amp;Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, cur: i64, target: i64, cnt: \u0026amp;mut HashMap\u0026lt;i64, i32\u0026gt;) -\u0026gt; i32 { let Some(n) = node.as_ref() else { return 0 }; let cur = cur + n.val; let mut ans = *cnt.get(\u0026amp;(cur - target)).unwrap_or(\u0026amp;0); *cnt.entry(cur).or_insert(0) += 1; ans += dfs(\u0026amp;n.left, cur, target, cnt); ans += dfs(\u0026amp;n.right, cur, target, cnt); if let Some(v) = cnt.get_mut(\u0026amp;cur) { *v -= 1; } ans } fn path_sum(root: \u0026amp;Option\u0026lt;Box\u0026lt;TreeNode\u0026gt;\u0026gt;, target: i64) -\u0026gt; i32 { let mut cnt: HashMap\u0026lt;i64, i32\u0026gt; = HashMap::new(); cnt.insert(0, 1); dfs(root, 0, target, \u0026amp;mut cnt) } fn main() { // 
Example 1 let mut root = Box::new(TreeNode::new(10)); root.left = Some(Box::new(TreeNode::new(5))); root.right = Some(Box::new(TreeNode::new(-3))); { let left = root.left.as_mut().unwrap(); left.left = Some(Box::new(TreeNode::new(3))); left.right = Some(Box::new(TreeNode::new(2))); let ll = left.left.as_mut().unwrap(); ll.left = Some(Box::new(TreeNode::new(3))); ll.right = Some(Box::new(TreeNode::new(-2))); let lr = left.right.as_mut().unwrap(); lr.right = Some(Box::new(TreeNode::new(1))); } { let right = root.right.as_mut().unwrap(); right.right = Some(Box::new(TreeNode::new(11))); } let root = Some(root); println!(\u0026#34;{}\u0026#34;, path_sum(\u0026amp;root, 8)); // 3 } function TreeNode(val, left = null, right = null) { this.val = val; this.left = left; this.right = right; } function pathSum(root, targetSum) { const cnt = new Map(); cnt.set(0, 1); function dfs(node, cur) { if (!node) return 0; cur += node.val; let ans = cnt.get(cur - targetSum) || 0; cnt.set(cur, (cnt.get(cur) || 0) + 1); ans += dfs(node.left, cur); ans += dfs(node.right, cur); cnt.set(cur, cnt.get(cur) - 1); return ans; } return dfs(root, 0); } // Example 1 const root = new TreeNode( 10, new TreeNode(5, new TreeNode(3, new TreeNode(3), new TreeNode(-2)), new TreeNode(2, null, new TreeNode(1))), new TreeNode(-3, null, new TreeNode(11)), ); console.log(pathSum(root, 8)); // 3 ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/437-path-sum-iii/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nThe constraint “the path can start anywhere, but must go downward” makes root-to-leaf DP insufficient. 
This ACERS guide explains \u003cstrong\u003eprefix sums on trees\u003c/strong\u003e: convert any downward path into a difference of two prefix sums, maintain a frequency hash map during one DFS, and finish in O(n).\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12–15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003ebinary tree\u003c/code\u003e, \u003ccode\u003eprefix sum\u003c/code\u003e, \u003ccode\u003eDFS\u003c/code\u003e, \u003ccode\u003ehash map\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Path Sum III, tree prefix sum, prefix-sum hash, LeetCode 437\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Count downward paths whose sum equals targetSum in O(n) via prefix sum + hash map, with derivation, tradeoffs, and multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLeetCode learners who want a reusable “tree + hash map” template\u003c/li\u003e\n\u003cli\u003ePeople who tend to write O(n²) when the path does not have to start at the root\u003c/li\u003e\n\u003cli\u003eEngineers working with hierarchical data (call traces, org trees) who need “downward segment” statistics\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eMany “tree path” problems hide a trap:\nyou naturally assume paths start at the root, or end at leaves — but this problem allows the path to start and end at \u003cstrong\u003eany nodes\u003c/strong\u003e, as long as the direction is downward (parent → child).\u003c/p\u003e","title":"Path Sum III: Prefix Sum + Hash Map Counting Downward Paths (LeetCode 437) ACERS Guide"},{"content":" Subtitle / Summary\nSpiral traversal looks like “just printing 
in a fancy order”, but the real difficulty is getting boundaries and invariants right. This ACERS guide gives a reusable boundary-shrinking template and runnable multi-language solutions.\nReading time: 12–15 min Tags: Hot100, matrix, simulation, boundary shrinking SEO keywords: Hot100, Spiral Matrix, clockwise traversal, boundary shrinking, LeetCode 54 Meta description: O(mn) spiral order traversal using boundary shrinking, with pitfalls, engineering scenarios, and runnable code. Target Readers Hot100 learners who want a reliable “matrix simulation” template Intermediate engineers who often get boundary cases wrong Anyone working with grids (visualization, raster data, path generation) Background / Motivation Matrix problems are notorious for being “easy to code, hard to get 100% correct”.\nOne extra loop or one missed boundary check can break single-row/single-column cases or cause duplicated output.\nSpiral Matrix is a great training problem because it forces you to make the loop invariant explicit:\nWhat region is still unvisited? How do we shrink it safely after finishing an edge? 
If you can express that invariant clearly, the code becomes short and robust.\nCore Concepts Boundaries: top / bottom / left / right define the current unvisited rectangle Layer: each iteration peels one outer “ring” (top row, right col, bottom row, left col) Shrink: after finishing an edge, move the boundary inward (top++, right--, bottom--, left++) Loop invariant: the unvisited region is always top..bottom × left..right A — Algorithm Problem Restatement Given an m × n matrix matrix, return all elements in clockwise spiral order.\nInput / Output Name Type Description matrix int[][] an m × n matrix return int[] elements in clockwise spiral order Example 1 matrix = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ] output: [1, 2, 3, 6, 9, 8, 7, 4, 5] Example 2 matrix = [ [ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12] ] output: [1, 2, 3, 4, 8, 12, 11, 10, 9, 5, 6, 7] C — Concepts Thought Process: from visited to boundary shrinking Naive: direction array + visited\nStart at (0,0), move right/down/left/up, rotate when hitting a boundary or visited cell.\nPros: intuitive Cons: needs m×n extra space; more branching and easier to get wrong Key observation: spiral traversal is “peel the onion”\nEach layer is exactly four edges: top row, right column, bottom row, left column.\nChosen method: boundary shrinking (O(1) extra space)\nMaintain top, bottom, left, right. Traverse edges, then shrink the boundary.\nMethod Type Matrix simulation Boundary shrinking Loop invariant + boundary conditions The key invariant (the reason it’s correct) At the start of each loop, the unvisited region is a rectangle:\nrows: top .. bottom cols: left .. 
right After finishing an edge, we shrink the corresponding boundary by 1:\ntop row done → top += 1 right col done → right -= 1 bottom row done → bottom -= 1 (only if top \u0026lt;= bottom) left col done → left += 1 (only if left \u0026lt;= right) Those two conditional checks are what prevent duplicates when only one row/column remains.\nPractical Steps Handle empty matrix: return [] Initialize boundaries: top=0, bottom=m-1, left=0, right=n-1 While top \u0026lt;= bottom and left \u0026lt;= right: Traverse top row (left → right), top++ Traverse right col (top → bottom), right-- If top \u0026lt;= bottom, traverse bottom row (right → left), bottom-- If left \u0026lt;= right, traverse left col (bottom → top), left++ Return the result Runnable Python example (save as spiral_matrix.py):\nfrom typing import List def spiral_order(matrix: List[List[int]]) -\u0026gt; List[int]: if not matrix or not matrix[0]: return [] m, n = len(matrix), len(matrix[0]) top, bottom, left, right = 0, m - 1, 0, n - 1 res: List[int] = [] while top \u0026lt;= bottom and left \u0026lt;= right: for j in range(left, right + 1): res.append(matrix[top][j]) top += 1 for i in range(top, bottom + 1): res.append(matrix[i][right]) right -= 1 if top \u0026lt;= bottom: for j in range(right, left - 1, -1): res.append(matrix[bottom][j]) bottom -= 1 if left \u0026lt;= right: for i in range(bottom, top - 1, -1): res.append(matrix[i][left]) left += 1 return res if __name__ == \u0026#34;__main__\u0026#34;: print(spiral_order([[1, 2, 3], [4, 5, 6], [7, 8, 9]])) print(spiral_order([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])) E — Engineering The “engineering value” of this problem is that it’s a reusable grid path generator.\nYou can replace “append value” with “emit coordinates”, and map coordinates to any domain object (pixels, tiles, warehouse bins, table cells).\nScenario 1: spiral feature extraction for image/raster patches (Python) Background: flatten a small patch (e.g., 7×7, 11×11) into a 1D feature 
vector.\nWhy it fits: spiral order preserves “outer-to-inner” structure in the sequence.\nfrom typing import List def spiral_order(matrix: List[List[int]]) -\u0026gt; List[int]: if not matrix or not matrix[0]: return [] m, n = len(matrix), len(matrix[0]) top, bottom, left, right = 0, m - 1, 0, n - 1 res: List[int] = [] while top \u0026lt;= bottom and left \u0026lt;= right: for j in range(left, right + 1): res.append(matrix[top][j]) top += 1 for i in range(top, bottom + 1): res.append(matrix[i][right]) right -= 1 if top \u0026lt;= bottom: for j in range(right, left - 1, -1): res.append(matrix[bottom][j]) bottom -= 1 if left \u0026lt;= right: for i in range(bottom, top - 1, -1): res.append(matrix[i][left]) left += 1 return res print(spiral_order([[0, 0, 1], [0, 1, 1], [1, 1, 1]])) Scenario 2: backend service streaming a grid “layer by layer” (Go) Background: maps/tiles/seat grids often want progressive loading: outer ring first for faster “first screen”.\nWhy it fits: boundary shrinking naturally yields a layer-by-layer traversal.\npackage main import \u0026#34;fmt\u0026#34; func spiralOrder(matrix [][]int) []int { if len(matrix) == 0 || len(matrix[0]) == 0 { return []int{} } m, n := len(matrix), len(matrix[0]) top, bottom, left, right := 0, m-1, 0, n-1 res := make([]int, 0, m*n) for top \u0026lt;= bottom \u0026amp;\u0026amp; left \u0026lt;= right { for j := left; j \u0026lt;= right; j++ { res = append(res, matrix[top][j]) } top++ for i := top; i \u0026lt;= bottom; i++ { res = append(res, matrix[i][right]) } right-- if top \u0026lt;= bottom { for j := right; j \u0026gt;= left; j-- { res = append(res, matrix[bottom][j]) } bottom-- } if left \u0026lt;= right { for i := bottom; i \u0026gt;= top; i-- { res = append(res, matrix[i][left]) } left++ } } return res } func main() { grid := [][]int{{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}} fmt.Println(spiralOrder(grid)) } Scenario 3: spiral scan path for robots / automation (C) Background: coverage scanning in a discrete 
grid (inspection, cleaning, sampling).\nWhy it fits: O(1) state; no visited buffer needed (good for embedded).\n#include \u0026lt;stdio.h\u0026gt; static void spiral_path(int m, int n) { int top = 0, bottom = m - 1, left = 0, right = n - 1; while (top \u0026lt;= bottom \u0026amp;\u0026amp; left \u0026lt;= right) { for (int j = left; j \u0026lt;= right; ++j) printf(\u0026#34;(%d,%d) \u0026#34;, top, j); ++top; for (int i = top; i \u0026lt;= bottom; ++i) printf(\u0026#34;(%d,%d) \u0026#34;, i, right); --right; if (top \u0026lt;= bottom) { for (int j = right; j \u0026gt;= left; --j) printf(\u0026#34;(%d,%d) \u0026#34;, bottom, j); --bottom; } if (left \u0026lt;= right) { for (int i = bottom; i \u0026gt;= top; --i) printf(\u0026#34;(%d,%d) \u0026#34;, i, left); ++left; } } printf(\u0026#34;\\\\n\u0026#34;); } int main(void) { spiral_path(3, 4); return 0; } Scenario 4: spiral highlight animation in a grid UI (JavaScript) Background: highlight cells in a spiral order for tutorials/animations on a grid.\nWhy it fits: you only need the traversal order as a frame sequence.\nfunction spiralOrder(matrix) { if (!matrix.length || !matrix[0].length) return []; let top = 0, bottom = matrix.length - 1; let left = 0, right = matrix[0].length - 1; const res = []; while (top \u0026lt;= bottom \u0026amp;\u0026amp; left \u0026lt;= right) { for (let j = left; j \u0026lt;= right; j++) res.push(matrix[top][j]); top++; for (let i = top; i \u0026lt;= bottom; i++) res.push(matrix[i][right]); right--; if (top \u0026lt;= bottom) { for (let j = right; j \u0026gt;= left; j--) res.push(matrix[bottom][j]); bottom--; } if (left \u0026lt;= right) { for (let i = bottom; i \u0026gt;= top; i--) res.push(matrix[i][left]); left++; } } return res; } console.log(spiralOrder([[1, 2, 3], [4, 5, 6], [7, 8, 9]])); R — Reflection Complexity Time: O(mn) (each element is output exactly once) Space: O(1) extra (excluding the output array); visited approach uses O(mn) Alternatives Method Idea Extra Space Typical 
issues visited + direction turns “walk and turn” O(mn) more conditionals, easy to get off-by-one recursion / per-layer slicing “peel layers” depends may introduce slicing copies or recursion overhead boundary shrinking (this post) traverse 4 edges + shrink O(1) must handle single-row/single-col carefully Why boundary shrinking is engineering-friendly Minimal state: four integers describe the progress No extra matrix: good for memory-constrained environments Easy to batch: each layer is naturally a “batch” for progressive output Explanation / Why it works Think of the unvisited part as a shrinking rectangle:\ntop edge: output row top, then top++ right edge: output col right, then right-- bottom edge: only if top \u0026lt;= bottom, output row bottom, then bottom-- left edge: only if left \u0026lt;= right, output col left, then left++ Those two checks are critical: when only one row or one column is left, you must skip the opposite edges to avoid duplicates.\nCommon Pitfalls and Notes Why do we need if top \u0026lt;= bottom and if left \u0026lt;= right?\nThey handle the “single remaining row/column” cases and prevent duplicates.\nShould we handle empty inputs?\nLeetCode usually guarantees m,n \u0026gt;= 1, but production code should return [] for empty input.\nWhat about jagged arrays (rows with different lengths)?\nThe problem assumes a rectangular matrix. 
Validate/normalize in real systems.\nHow to output coordinates instead of values?\nReplace res.append(matrix[i][j]) with emitting (i, j) (or pushing into a queue/channel).\nBest Practices Write the invariant first: unvisited region is top..bottom × left..right Shrink boundaries immediately after finishing an edge Keep the two safety checks for bottom/left edges For streaming output, replace “append” with “yield/send” S — Summary Key Takeaways Spiral traversal is “peel layers”, not “random turns” top/bottom/left/right boundaries give O(1) extra space The two checks (top\u0026lt;=bottom, left\u0026lt;=right) are the correctness key The same template works as a generic grid path generator Conclusion The goal is not “pass the sample”, but “never break on single-row/single-col edge cases”.\nOnce you master boundary shrinking, many matrix simulation problems become straightforward.\nReferences \u0026amp; Further Reading LeetCode 54. Spiral Matrix LeetCode 59. Spiral Matrix II (generate spiral matrix) LeetCode 885. Spiral Matrix III (expanding spiral path) Meta Reading time: 12–15 min Tags: Hot100, matrix, simulation, boundary shrinking, LeetCode 54 SEO keywords: Hot100, Spiral Matrix, boundary shrinking, O(mn), LeetCode 54 Meta description: O(mn) spiral traversal using boundary shrinking, with pitfalls and multi-language code. 
Call to Action Turn the boundary-shrinking logic into your personal “matrix simulation template”.\nNext time you see a spiral/layered traversal, copy the template and only swap the “output action”.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) from typing import List def spiral_order(matrix: List[List[int]]) -\u0026gt; List[int]: if not matrix or not matrix[0]: return [] m, n = len(matrix), len(matrix[0]) top, bottom, left, right = 0, m - 1, 0, n - 1 res: List[int] = [] while top \u0026lt;= bottom and left \u0026lt;= right: for j in range(left, right + 1): res.append(matrix[top][j]) top += 1 for i in range(top, bottom + 1): res.append(matrix[i][right]) right -= 1 if top \u0026lt;= bottom: for j in range(right, left - 1, -1): res.append(matrix[bottom][j]) bottom -= 1 if left \u0026lt;= right: for i in range(bottom, top - 1, -1): res.append(matrix[i][left]) left += 1 return res if __name__ == \u0026#34;__main__\u0026#34;: print(spiral_order([[1, 2, 3], [4, 5, 6], [7, 8, 9]])) #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; int* spiral_order(int** matrix, int m, int n, int* returnSize) { if (m \u0026lt;= 0 || n \u0026lt;= 0) { *returnSize = 0; return NULL; } int total = m * n; int* res = (int*)malloc((size_t)total * sizeof(int)); int idx = 0; int top = 0, bottom = m - 1, left = 0, right = n - 1; while (top \u0026lt;= bottom \u0026amp;\u0026amp; left \u0026lt;= right) { for (int j = left; j \u0026lt;= right; ++j) res[idx++] = matrix[top][j]; ++top; for (int i = top; i \u0026lt;= bottom; ++i) res[idx++] = matrix[i][right]; --right; if (top \u0026lt;= bottom) { for (int j = right; j \u0026gt;= left; --j) res[idx++] = matrix[bottom][j]; --bottom; } if (left \u0026lt;= right) { for (int i = bottom; i \u0026gt;= top; --i) res[idx++] = matrix[i][left]; ++left; } } *returnSize = idx; return res; } int main(void) { int a0[] = {1, 2, 3, 4}; int a1[] = {5, 6, 7, 8}; int a2[] = {9, 10, 11, 12}; int* matrix[] = {a0, a1, a2}; int 
returnSize = 0; int* res = spiral_order(matrix, 3, 4, \u0026amp;returnSize); for (int i = 0; i \u0026lt; returnSize; ++i) { if (i) printf(\u0026#34;, \u0026#34;); printf(\u0026#34;%d\u0026#34;, res[i]); } printf(\u0026#34;\\\\n\u0026#34;); free(res); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; std::vector\u0026lt;int\u0026gt; spiralOrder(const std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt;\u0026amp; matrix) { if (matrix.empty() || matrix[0].empty()) return {}; int m = (int)matrix.size(); int n = (int)matrix[0].size(); int top = 0, bottom = m - 1, left = 0, right = n - 1; std::vector\u0026lt;int\u0026gt; res; res.reserve((size_t)m * (size_t)n); while (top \u0026lt;= bottom \u0026amp;\u0026amp; left \u0026lt;= right) { for (int j = left; j \u0026lt;= right; ++j) res.push_back(matrix[top][j]); ++top; for (int i = top; i \u0026lt;= bottom; ++i) res.push_back(matrix[i][right]); --right; if (top \u0026lt;= bottom) { for (int j = right; j \u0026gt;= left; --j) res.push_back(matrix[bottom][j]); --bottom; } if (left \u0026lt;= right) { for (int i = bottom; i \u0026gt;= top; --i) res.push_back(matrix[i][left]); ++left; } } return res; } int main() { std::vector\u0026lt;std::vector\u0026lt;int\u0026gt;\u0026gt; m = {{1,2,3,4},{5,6,7,8},{9,10,11,12}}; auto res = spiralOrder(m); for (size_t i = 0; i \u0026lt; res.size(); ++i) { if (i) std::cout \u0026lt;\u0026lt; \u0026#34;, \u0026#34;; std::cout \u0026lt;\u0026lt; res[i]; } std::cout \u0026lt;\u0026lt; \u0026#34;\\\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func spiralOrder(matrix [][]int) []int { if len(matrix) == 0 || len(matrix[0]) == 0 { return []int{} } m, n := len(matrix), len(matrix[0]) top, bottom, left, right := 0, m-1, 0, n-1 res := make([]int, 0, m*n) for top \u0026lt;= bottom \u0026amp;\u0026amp; left \u0026lt;= right { for j := left; j \u0026lt;= right; j++ { res = append(res, matrix[top][j]) } top++ for i := top; i \u0026lt;= 
bottom; i++ { res = append(res, matrix[i][right]) } right-- if top \u0026lt;= bottom { for j := right; j \u0026gt;= left; j-- { res = append(res, matrix[bottom][j]) } bottom-- } if left \u0026lt;= right { for i := bottom; i \u0026gt;= top; i-- { res = append(res, matrix[i][left]) } left++ } } return res } func main() { grid := [][]int{{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}} fmt.Println(spiralOrder(grid)) } fn spiral_order(matrix: \u0026amp;[Vec\u0026lt;i32\u0026gt;]) -\u0026gt; Vec\u0026lt;i32\u0026gt; { if matrix.is_empty() || matrix[0].is_empty() { return vec![]; } let m = matrix.len() as i32; let n = matrix[0].len() as i32; let (mut top, mut bottom, mut left, mut right) = (0i32, m - 1, 0i32, n - 1); let mut res: Vec\u0026lt;i32\u0026gt; = Vec::with_capacity((m * n) as usize); while top \u0026lt;= bottom \u0026amp;\u0026amp; left \u0026lt;= right { for j in left..=right { res.push(matrix[top as usize][j as usize]); } top += 1; for i in top..=bottom { res.push(matrix[i as usize][right as usize]); } right -= 1; if top \u0026lt;= bottom { for j in (left..=right).rev() { res.push(matrix[bottom as usize][j as usize]); } bottom -= 1; } if left \u0026lt;= right { for i in (top..=bottom).rev() { res.push(matrix[i as usize][left as usize]); } left += 1; } } res } fn main() { let matrix = vec![vec![1, 2, 3, 4], vec![5, 6, 7, 8], vec![9, 10, 11, 12]]; println!(\u0026#34;{:?}\u0026#34;, spiral_order(\u0026amp;matrix)); } function spiralOrder(matrix) { if (!matrix.length || !matrix[0].length) return []; let top = 0, bottom = matrix.length - 1; let left = 0, right = matrix[0].length - 1; const res = []; while (top \u0026lt;= bottom \u0026amp;\u0026amp; left \u0026lt;= right) { for (let j = left; j \u0026lt;= right; j++) res.push(matrix[top][j]); top++; for (let i = top; i \u0026lt;= bottom; i++) res.push(matrix[i][right]); right--; if (top \u0026lt;= bottom) { for (let j = right; j \u0026gt;= left; j--) res.push(matrix[bottom][j]); bottom--; } if (left \u0026lt;= 
right) { for (let i = bottom; i \u0026gt;= top; i--) res.push(matrix[i][left]); left++; } } return res; } console.log(spiralOrder([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/54-spiral-matrix/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nSpiral traversal looks like “just printing in a fancy order”, but the real difficulty is getting boundaries and invariants right. This ACERS guide gives a reusable boundary-shrinking template and runnable multi-language solutions.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12–15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003ematrix\u003c/code\u003e, \u003ccode\u003esimulation\u003c/code\u003e, \u003ccode\u003eboundary shrinking\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Hot100, Spiral Matrix, clockwise traversal, boundary shrinking, LeetCode 54\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: O(mn) spiral order traversal using boundary shrinking, with pitfalls, engineering scenarios, and runnable code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners who want a reliable “matrix simulation” template\u003c/li\u003e\n\u003cli\u003eIntermediate engineers who often get boundary cases wrong\u003c/li\u003e\n\u003cli\u003eAnyone working with grids (visualization, raster data, path generation)\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eMatrix problems are notorious for being “easy to code, hard to get 100% correct”.\u003cbr\u003e\nOne extra 
loop or one missed boundary check can break single-row/single-column cases or cause duplicated output.\u003c/p\u003e","title":"Hot100: Spiral Matrix (Boundary Shrinking Simulation ACERS Guide)"},{"content":" Subtitle / Abstract Not a horizontal list. We use two core ideas: dependency path length and resource complexity. Treat \u0026ldquo;path length\u0026rdquo; as how far information can travel, and \u0026ldquo;resource complexity\u0026rdquo; as the hard constraint on trainability. Once you understand both, you can judge when CNN/RNN/LSTM/Transformer fits best and make measurable tradeoffs.\nEstimated reading time: Approx. 27 min Tags: cnn, rnn, lstm, transformer SEO keywords: CNN, RNN, LSTM, Transformer Meta description: Compare CNN, RNN, LSTM, and Transformer via path length and resource complexity. Target readers Beginners who want a fast comparison of major neural architectures Practitioners who must choose a model for production Developers working on sequence modeling or multimodal systems Background / Motivation Choosing a model answers two questions:\nHow far and how long can information travel in a sequence (dependency path length) Can compute and memory budgets support it (resource complexity) This article stays on these two axes and avoids a wide, shallow overview.\nA concrete example: when n = 1024, an RNN needs 1024 sequential steps for one forward pass. A Transformer can achieve global interaction within 6 to 12 layers, but the attention matrix has n^2 = 1,048,576 elements. These two hard facts almost decide the outcome: you are blocked by path length or by memory/throughput. Ignore either axis and you will pay with accuracy or cost.\nFast mastery map (60-120s) Problem shape: images/grids -\u0026gt; CNN; sequences -\u0026gt; RNN/LSTM/Transformer. Core difference: RNN/LSTM path length grows with n; Transformer path length is near 1 but cost grows as n^2. 
When to use/avoid: n \u0026lt;= 256 and low compute -\u0026gt; LSTM/RNN; n \u0026gt;= 512 and need parallelism -\u0026gt; Transformer; pure vision -\u0026gt; CNN. Complexity keywords: CNN O(HWk^2); RNN O(n d^2) serial; LSTM O(4 n d^2); Transformer O(n^2 d). Common traps: ignore n^2 memory, misjudge dependency range, mismatch masks or shapes. Master-level mental model Core abstraction: treat sequence modeling as information routing on a computation graph. Path length decides reachability; resource complexity decides feasibility. Problem family: local connections (CNN), chain propagation (RNN/LSTM), global similarity aggregation (Transformer) are different graphs with different shortest paths. Isomorphic template: information routing = aggregate(neighbors). RNN uses linear neighbors, CNN uses fixed-radius neighbors, attention uses all-to-all neighbors. Key invariant: if shortest path L grows with n, long dependencies are hard to learn; if interactions are n^2, memory and time costs are unavoidable. Core concepts and terms (only two deep dives) Dependency path length and parallelism: decides whether long dependencies can be modeled. Resource complexity (time/memory) vs n: decides whether the model can be trained or deployed. Key terms (used throughout):\nPath length L: shortest number of edges from position i to j in the computation graph. Sequential steps S: number of ordered steps in a forward pass. RNN has S ~= n, CNN/Transformer has S ~= number of layers. Receptive field R: span covered by CNN in input space, R = 1 + (k - 1) * L (no dilation). Sequence length n / hidden size d: dominant variables in complexity. These four values are enough to write the core formulas for path length and resource complexity.\nA directly usable estimate: If each layer only connects neighbors within radius r, then spanning distance d needs L \u0026gt;= ceil(d / r). 
Example: r = 2, d = 256 -\u0026gt; L \u0026gt;= 128, which is expensive in depth and gradients.\nProblem abstraction (inputs/outputs) Image input: X in R^{B x C x H x W}, output is classification/detection logits. Sequence input: X in R^{B x n x d}, output is per-step prediction or a sequence representation. Optimization target: maximize accuracy and throughput within compute/memory budget while meeting latency. Typical engineering ranges:\nSequence length: n in [128, 8192], and n \u0026gt;= 1024 is \u0026ldquo;long sequence\u0026rdquo;. Memory budget: 16 to 80 GB per GPU; n \u0026gt;= 4096 often triggers OOM with full attention. Latency target: online inference often wants P95 \u0026lt; 200 ms, amplifying serial bottlenecks. Feasibility and lower bound intuition Path length lower bound: if each layer only connects neighbors within radius r (RNN has r = 1, CNN has r = (k - 1)/2), then spanning distance d needs L \u0026gt;= ceil(d / r) layers. Example: 1D CNN with k = 3, r = 1, covering d = 512 requires L \u0026gt;= 512 layers. Even with k = 5 (r = 2), you still need L \u0026gt;= 256. Depth cost remains huge.\nAttention lower bound: full attention computes similarity for any i, j, which implies at least Omega(n^2) interactions or memory reads. Unless you drop interactions (window, sparse, approximate), this upper bound is unavoidable.\nA common compromise is downsample then attend: If you reduce n from 2048 to 1024, attention cost drops to 1/4, but each token covers more information, effectively changing the \u0026ldquo;path length\u0026rdquo;. You always trade between the two axes: compress length or pay quadratic cost.\nNaive baselines and bottlenecks Baseline 1: RNN on long sequences When n = 1024, you need 1024 sequential steps; GPU utilization is low. Backprop must keep all intermediate states, so training time rises sharply. 
Baseline 2: shallow CNN for long dependencies With k = 3 and L = 8, receptive field is only R = 17, which is almost blind for n = 512 tasks. Stacking more layers expands R, but parameters and training time explode. Even if per-step compute is cheap, sequential steps decide latency: If one step is 0.3 ms, n = 512 RNN forward is about 154 ms. Transformer has only as many steps as layers (for example 6 layers ~ 1.8 ms). This is why baselines are usable but hard to scale.\nKey observation Dependency is not the time order itself, but the strength of relationships between positions. If all positions can \u0026ldquo;see\u0026rdquo; each other in one layer, path length drops from O(n) to O(1). The cost is interaction count jumping from O(n) to O(n^2), i.e., resource complexity.\nDeep concept 1: dependency path length and parallelism (PDKH) 1) Restate the problem (Polya) If information from position i must affect position j, it must travel along the computation graph. The longer the path, the more gradients decay and the slower training becomes.\nTreat each layer-position as a node, and each valid connection as an edge. Path length L is the shortest path length. Short paths mean fast aggregation; long paths mean repeated transforms before information arrives. This is why path length nearly decides whether long dependencies can be learned.\n2) Minimal example (Bentley) Let sequence length n = 6, and position 1 must influence position 6:\nRNN/LSTM: must pass step by step, path length = 5. CNN (k = 3, L = 2): receptive field is 1 + (k - 1)L = 5, still cannot reach position 6. You need L = 3 layers to cover the full range. Transformer: any position can attend within the same layer, path length = 1. 
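The minimal example above can be checked with a short sketch. It assumes the one-sided receptive-field formula R = 1 + (k - 1) * L used in this article; the function names are illustrative, not from any library:

```python
import math

def rnn_path_length(i: int, j: int) -> int:
    # RNN/LSTM: information moves one position per state transition
    return abs(i - j)

def cnn_layers_needed(i: int, j: int, k: int) -> int:
    # One-sided receptive field R = 1 + (k - 1) * L must span |i - j| + 1 positions
    span = abs(i - j) + 1
    return math.ceil((span - 1) / (k - 1))

def transformer_path_length(i: int, j: int) -> int:
    # One attention layer connects any pair of positions
    return 1

# Position 1 must influence position 6 (n = 6)
print(rnn_path_length(1, 6))          # 5
print(cnn_layers_needed(1, 6, 3))     # 3
print(transformer_path_length(1, 6))  # 1
```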
Path length and parallelism comparison Structure Path length L (dependency distance d) Parallelism Notes RNN L = d Low Serial dependency, hard to parallelize LSTM L = d Low Gates mitigate gradient decay CNN L \u0026gt;= ceil((d - 1)/(k - 1)) Medium-High Depends on depth and kernel width Transformer L = 1 High Global attention in parallel Sequential step examples (S) Assume n = 1024 for a single forward pass:\nRNN/LSTM: 1024 ordered steps, S ~= 1024. Transformer (6 layers): 6 ordered steps, S ~= 6. CNN (20 layers): 20 ordered steps, S ~= 20. This explains why RNN throughput is low on GPU: not slow ops, but too many serial steps.\nA rough estimate: if each step is ~0.2 ms, S = 1024 gives ~205 ms per forward pass; S = 6 gives ~1.2 ms (ignoring communication and memory bottlenecks).\nWorked example: how many CNN layers for long dependencies? To cover dependency distance d = 512 with k = 3, L \u0026gt;= (d - 1)/(k - 1) = 255.5, so at least 256 layers. This is why CNNs are often replaced by attention for long sequences.\nMicro-trace: n = 4 dependency propagation Sequence [x1, x2, x3, x4], make x1 influence x4:\nRNN: x1 -\u0026gt; h2 -\u0026gt; h3 -\u0026gt; h4, path length = 3. CNN (k = 3, L = 2): layer 1 lets x1 affect {x1, x2}, layer 2 reaches x3 but not x4. Transformer: x1 directly participates in x4 attention, path length = 1. This tiny example shows path length differences exist even at the smallest scale.\n3) Invariants / contracts (Dijkstra/Hoare) To stably capture dependencies of distance d, the graph must provide paths with length L \u0026lt;= d. When L grows with n, long-dependency training becomes substantially harder.\nGradient decay intuition RNN gradients are products of Jacobians: d h_t / d h_{t-k} = product_{i=t-k+1}^{t} J_i. When k is large, the product quickly shrinks or explodes. This is the root of long dependency difficulty. 
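The Jacobian product above can be approximated with a toy sketch: assume each step scales the gradient by a constant factor rho (an average spectral radius), so after k steps the surviving magnitude is rho ** k. Real networks vary per step; this only illustrates the order of decay:

```python
# Toy model of d h_t / d h_{t-k}: a product of k identical decay factors.
for rho in (0.9, 0.99):
    for k in (10, 100, 1000):
        print(f"rho={rho}, k={k}: {rho ** k:.3e}")
```

Even at rho = 0.99, the signal is down to roughly a third after 100 steps, which is why long chains need gates or residuals to stay trainable.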
LSTM uses the cell state as a more direct path, but the length is still O(n).\nA numeric intuition: if the average spectral radius is ~0.9, then after 100 steps the gradient magnitude is about 0.9^100 ~= 0.000026. Even at 0.99, 0.99^100 ~= 0.366, still decaying. So the longer the path, the more you rely on gates or residuals to keep training stable.\nDependency span examples (why long dependencies are hard) Copy task: sequence length n = 512, output the first token at the end. RNN/LSTM must carry information for 511 steps. Transformer can connect start and end in one attention pass.\nBracket matching: matching outer parentheses often spans the entire sequence. Such tasks are extremely sensitive to path length and often favor Transformers.\nEstimating dependency span Text: measure dependency distances within a sentence (often \u0026lt; 128). If cross-paragraph dependencies are common, spans can reach 512 or more. Time series: use autocorrelation length as \u0026ldquo;effective memory\u0026rdquo;. Video/visual sequences: span is driven by object trajectories across frames. Practically, use P90 as a \u0026ldquo;safe span\u0026rdquo;: If 90% of dependencies are below 256, CNN/LSTM is often enough. If P90 exceeds 512, Transformer advantages are usually stable.\nOnce you estimate a typical span d, model selection has a direction.\n4) Formalization (Knuth) RNN/LSTM: path length L = |i - j|. 1D CNN: receptive field R = 1 + (k - 1)L, so covering distance d needs L \u0026gt;= (d - 1)/(k - 1). Transformer: one layer connects any positions, L = 1. Parallelism can be seen as \u0026ldquo;how many sequential steps are required\u0026rdquo;: RNN/LSTM need n steps; CNN/Transformer mainly depend on depth. This directly explains why Transformers have high training throughput.\n5) Correctness sketch (Dijkstra/Hoare) RNN state only moves from t-1 to t, so crossing distance d needs d steps. CNN expands receptive field by (k - 1) per layer, so L layers give R = 1 + (k - 1)L. 
Transformer attention builds global dependencies, so path length is 1. Structure-by-structure deepening (path length view) CNN: Receptive field grows linearly. For k = 3, it goes 3, 5, 7, 9\u0026hellip; At L = 6, R = 13; at L = 20, R = 41. This shows why CNNs need extreme depth for long dependencies.\nWith dilation, the formula becomes R = 1 + (k - 1) * sum d_l. Example: 4 layers with d_l = [1, 2, 4, 8] gives R = 1 + 2 * (1 + 2 + 4 + 8) = 31. This is larger but still far from n = 512. It improves path length but does not change the linear-growth nature.\nRNN: Path length equals time steps. For n = 512, the furthest dependency needs 511 state transfers. Even if per-step compute is cheap, long chains amplify gradient decay.\nLSTM: Gating stabilizes \u0026ldquo;effective memory\u0026rdquo;, but path length is still O(n). In practice, tricks like setting forget bias b_f = 1 extend memory but do not change the order of growth.\nTransformer: Path length is 1, turning long dependencies into global parallel matrix ops. The cost is higher memory and compute (see concept 2).\nHow LSTM gating extends \u0026ldquo;effective memory\u0026rdquo; LSTM centers on cell state c_t with three gates: f_t = sigma(W_f [x_t, h_{t-1}]) (forget gate) i_t = sigma(W_i [x_t, h_{t-1}]) (input gate) o_t = sigma(W_o [x_t, h_{t-1}]) (output gate) c_t = f_t * c_{t-1} + i_t * tanh(W_c [x_t, h_{t-1}])\nHere * is elementwise multiplication. When f_t is near 1, c_t preserves information longer. This explains why LSTM handles medium-long sequences better than vanilla RNN, but it does not change the fact that path length grows with n.\nIf the mean f_t is ~0.95, the memory factor after 200 steps is 0.95^200 ~= 0.000035. Even at 0.99, after 200 steps it is 0.99^200 ~= 0.133. 
Gates extend \u0026ldquo;effective path length\u0026rdquo; but cannot change the order.\nTransformer \u0026ldquo;short path\u0026rdquo; still needs order signals Attention is permutation-invariant; without positional encoding, Transformer treats the sequence as a set. So path length = 1 does not automatically solve order. Positional encoding is required.\nA common sinusoidal form: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))\nFrom a graph view, attention matrix A = softmax(QK^T) is a weighted fully connected graph. Each row sums to 1, output is a convex combination of values. So a single attention layer can route information between any positions, which is why path length collapses to 1.\nWorked example: CNN receptive field growth table (k = 3) Layers L Receptive field R 1 3 2 5 3 7 4 9 8 17 16 33 def receptive_field(k, layers): return 1 + (k - 1) * layers for L in [1, 2, 4, 8, 16]: print(L, receptive_field(3, L)) 6) Thresholds and scale (Knuth) When dependency span \u0026gt; 256, RNN/LSTM often struggle. When span \u0026gt; 512, Transformer advantages become clear. But this also introduces n^2 cost (see concept 2). These thresholds are empirical, not theoretical limits. In speech and short text (n ~ 128-256), LSTM can still be stable. In long documents and code (n \u0026gt;= 512), path length dominates, and if you also need high throughput, attention parallelism becomes valuable.\n7) Counterexamples / failure modes (Bentley/Sedgewick) If the task is local dependency (for example n \u0026lt;= 128 short text classification), Transformer can overfit due to excessive global modeling. 
In that case LSTM or 1D CNN is often more stable.\nExample: in a sentiment task with n = 64 and only tens of thousands of samples, Transformer capacity can be too high, so path-length advantage does not translate to accuracy.\n8) Engineering reality (Knuth) Shorter path is not always better: Transformer needs positional encoding to express order; RNN/LSTM can still keep stable memory for n = 200-500.\nCommon \u0026ldquo;mitigations\u0026rdquo;:\nCNN uses residual or pyramid structures to extend receptive field. RNN/LSTM uses truncated BPTT to control training cost. Transformer uses relative positional encoding to strengthen locality. All of these shorten the effective path without changing the main structure. Truncated BPTT impact: if you truncate backprop to 256 steps, you effectively cap dependency span at 256. This is fine for speech or short text, but hurts long-doc summarization or code understanding. So truncation length is your \u0026ldquo;engineering path length budget\u0026rdquo;.\nDeep concept 2: resource complexity as n grows (PDKH) 1) Restate the problem (Polya) When n grows, can the model still be trained and deployed? This is decided by time and memory complexity.\nSplit resource complexity into three dimensions:\nCompute (FLOPs): determines training/inference speed. Memory footprint: decides OOM risk. Memory bandwidth: determines whether throughput is limited by reads/writes. Transformer is often not compute-bound; it is memory-bandwidth bound. 2) Minimal example (Bentley) Let n = 2048, d_model = 512, h = 8:\nAttention matrix has n^2 = 4,194,304 elements. One head in FP16 is about 8 MB; 8 heads about 64 MB. Training also needs activations and gradients, often 3 to 5 times peak. Resource estimate (attention weights) If batch is B, heads h, dtype FP16 (2 bytes): memory ~= B * h * n^2 * 2 bytes. Example: B = 4, h = 8, n = 2048: 4 * 8 * 2048^2 * 2 ~= 256 MB (attention weights only).\nThis scales brutally:\nB doubles -\u0026gt; memory doubles. 
n doubles -\u0026gt; memory becomes 4x. h doubles -\u0026gt; memory doubles. So increasing n from 2k to 4k is often more dangerous than increasing layers. A more practical estimate is to solve for max n: n_max ~= sqrt(memory_budget / (B * h * 2 bytes)). If memory budget is 8 GB, B = 2, h = 8, then n_max ~= sqrt(8 GB / 32 bytes) ~= 16k. But with 4x to 8x peak overhead, real n is usually 3x to 4x smaller.\nn and memory scale (single head FP16) n n^2 elements Approx memory 512 262,144 ~0.5 MB 1024 1,048,576 ~2 MB 2048 4,194,304 ~8 MB 4096 16,777,216 ~32 MB 8192 67,108,864 ~128 MB You must also consider memory bandwidth: At n = 2048, one head\u0026rsquo;s weights are ~8 MB; with 12 layers that is ~96 MB of reads/writes. Training also reads/writes gradients and activations, so bandwidth pressure grows. This is why FlashAttention speeds up by reducing reads/writes.\ndef attn_memory_mb(n, h=8, batch=4, bytes_per_elem=2): return batch * h * n * n * bytes_per_elem / (1024 ** 2) for n in [512, 1024, 2048, 4096]: print(n, f\u0026#34;{attn_memory_mb(n):.1f} MB\u0026#34;) 3) Invariants / contracts (Dijkstra/Hoare) With full attention, you must compute n^2 interactions explicitly or implicitly. Without approximation, this cost cannot be avoided.\n4) Formalization (Knuth) CNN: O(HWk^2) (or O(n k d^2) for sequences) RNN: O(n d^2) (serial) LSTM: O(4 n d^2) Transformer: O(n^2 d) + O(n d^2) (FFN) Rough compute estimate (n = 1024, d = 512) RNN: per-step d^2, total 1024 * 512^2 ~= 268M MACs. LSTM: about 4x, ~1.07B MACs. Transformer attention: n^2 * d_k. If d_k = 64, 1024^2 * 64 ~= 67M MACs, but FFN adds 2 * n * d * d_ff (d_ff = 2048 gives ~2.1B). Conclusion: Transformer bottlenecks are often FFN and attention memory, not pure FLOPs.\nThe dominance boundary is: n^2 d (attention) vs 2 n d d_ff (FFN). Simplify to n \u0026gt; 2 d_ff for attention to dominate. If d_ff = 2048, attention dominates only when n \u0026gt; 4096. 
This explains why FFN is the bottleneck at moderate length and attention dominates at very long length.\n5) Correctness sketch (Dijkstra/Hoare) Transformer computes QK^T for all token pairs, so time and memory must scale with n^2.\n6) Thresholds and scale (Knuth) n \u0026lt;= 2048: full attention is usually feasible. 2048 \u0026lt; n \u0026lt;= 8192: use FlashAttention or block attention. n \u0026gt; 8192: require sparse/linear attention or retrieval. A practical upper bound: Single GPU 24 GB, B = 2, h = 8, attention weights at n = 4096 are ~512 MB. With activations and optimizer states, you often approach 16 to 24 GB. So n = 4k is already a warning line for single-GPU training.\n7) Counterexamples / failure modes (Bentley/Sedgewick) On a 16 GB single GPU, forcing n = 8k full attention will often OOM or require tiny batch sizes, lowering efficiency.\n8) Engineering reality (Knuth) Common fixes: FlashAttention, block attention, KV cache, gradient checkpointing. These trade engineering complexity for trainability and throughput.\nTraining vs inference complexity Training: full attention needs the n^2 matrix, high memory and compute. Inference (autoregressive): with KV cache, each step interacts with history, per-step complexity is ~O(n), and memory is more manageable. This is why Transformer inference can stay workable while training needs strong compute and memory optimization.\nKV cache memory estimate (per layer, storing both K and V): memory ~= 2 * B * h * n * d_k * 2 bytes. If B = 1, h = 8, n = 4096, d_k = 64, that is ~8 MB per layer; multiply by the layer count for the total. If B = 8 or n = 16k, memory grows linearly and must be planned.\nWorked example: n = 1024 vs n = 4096 For single head FP16:\nn = 1024, attention weights ~2 MB. n = 4096, ~32 MB, 16x larger. If B = 4, h = 8, n = 4096, attention weights alone exceed 1 GB, not counting gradients or activations. 
So \u0026ldquo;double the length\u0026rdquo; is not a linear cost increase, it is quadratic.\nComplexity and scale summary (two axes together) Structure Path length L Sequential steps S Time complexity (dominant) Memory complexity (dominant) CNN ~(d/(k-1)) ~layers O(n k d^2) or O(HWk^2) O(n d) RNN d ~n O(n d^2) O(n d) LSTM d ~n O(4 n d^2) O(n d) Transformer 1 ~layers O(n^2 d) + O(n d^2) O(n^2) + O(n d) This table puts path length and resource complexity on one plane: short-path structures (Transformer) are resource-heavy; resource-stable structures (RNN/LSTM) suffer on path length.\nRemember sequential steps S are a hard ceiling. Even with more machines, S is hard to parallelize away. For n = 1024, an RNN needs 1024 ordered steps; adding GPUs only increases batch size, it does not reduce the number of steps.\nConstant factors and engineering reality (related to the two axes) Operator granularity: RNN has many small matmuls, so GPU utilization is low; Transformer has fewer large matmuls, but bandwidth is the bottleneck. Precision and memory: FP16/BF16 halves attention and activation memory, but does not change path length or dependency span. Residuals and caching: residuals shorten the effective path but increase activation storage; short-path models rely more on cache and bandwidth. Worked example (trace): same task, path and cost Toy task: n = 8, make token 1 influence token 8. Compare path and cost:\nRNN/LSTM Path length L = 7, the signal must pass through 7 state transfers. Sequential steps S = 8, which cannot be parallelized.\nCNN (k = 3) Receptive field R = 1 + 2L. L = 1 -\u0026gt; R = 3, L = 2 -\u0026gt; R = 5, L = 3 -\u0026gt; R = 7, L = 4 -\u0026gt; R = 9. Only L \u0026gt;= 4 covers x1 -\u0026gt; x8.\nTransformer (1 layer) Path length L = 1, x1 can directly influence x8. But the attention matrix has n^2 = 64 elements per head. 
This is tiny at n = 8, but jumps to 4,194,304 at n = 2048.\nThis shows: path advantage exists at small scale; resource disadvantage explodes at large scale.\nPractical guide / steps (selection workflow) Estimate dependency span: for text, use dependency or sentence span; frequent cross-paragraph links suggest d \u0026gt;= 512. Estimate sequence length n: use P50/P90/Max because n determines n^2 cost. Check budget: use B * h * n^2 * 2 bytes to estimate attention memory and leave 4x to 8x headroom. Check parallelism needs: if online inference needs P95 \u0026lt; 200 ms, serial models are likely out. Run a light baseline: small CNN/LSTM to verify learnability and set a minimum accuracy bar. Upgrade structure: if d is large and budget allows, move to Transformer; if budget is tight, consider local attention or hybrid models. You can compress this into two must-answer questions:\nDependency span: is the furthest dependency d clearly \u0026gt; 256? Budget: can memory handle B * h * n^2 attention? If both are clear, selection is usually on track.\nA simplified 2x2 decision matrix:\nsmall d + small budget -\u0026gt; CNN/LSTM small d + large budget -\u0026gt; small Transformer or CNN large d + small budget -\u0026gt; local attention or hybrid large d + large budget -\u0026gt; full Transformer If d is unclear, train a small attention model and inspect attention span distribution first.\nSelection guide Dependency span threshold: if d \u0026lt;= 128, CNN or small RNN is often enough; d \u0026gt;= 512 favors Transformer. Sequence length threshold: n \u0026lt;= 256 makes full attention cheap; n \u0026gt;= 2048 requires memory planning. Memory budget threshold: on a 24 GB GPU, B = 2, h = 8, n = 4096 attention weights are ~512 MB. Add activations and optimizer state and you can hit 16 to 24 GB quickly. Implementation complexity tolerance: if your team cannot optimize kernels, use mature implementations (standard Transformer + FlashAttention). 
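The receptive-field arithmetic from the worked trace (R = 1 + L(k-1) for a stack of L stride-1 convolutions with kernel k, which gives R = 1 + 2L at k = 3) can be sketched as a tiny helper. `cnn_layers_needed` is an illustrative name, assuming stride-1 stacking with no dilation:

```python
import math

def cnn_layers_needed(d, k=3):
    # Layers of stride-1, kernel-k convolutions needed so the receptive
    # field R = 1 + L * (k - 1) covers a dependency span of d tokens.
    return math.ceil((d - 1) / (k - 1))

print(cnn_layers_needed(8))    # 4, matching the toy n = 8 trace
print(cnn_layers_needed(512))  # 256 layers to cover a span of 512
```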
Runnable example (minimal contrast) The code below only contrasts structure-level behavior, not training or loss. It helps you see CNN local aggregation, LSTM sequential state, and Transformer global interaction. Run it to observe output shapes.\nimport torch import torch.nn as nn # CNN cnn = nn.Sequential( nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10), ) img = torch.randn(2, 3, 32, 32) print(\u0026#34;cnn:\u0026#34;, cnn(img).shape) # LSTM lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True) seq = torch.randn(2, 5, 16) out, _ = lstm(seq) print(\u0026#34;lstm:\u0026#34;, out.shape) # Transformer encoder = nn.TransformerEncoder( nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True), num_layers=2, ) seq = torch.randn(2, 6, 32) print(\u0026#34;transformer:\u0026#34;, encoder(seq).shape) Explanation and principles (two axes) Dependency path: Transformer is shortest, RNN/LSTM is longest, CNN depends on depth and kernel size. Resource cost: Transformer is the most expensive (n^2), RNN/LSTM cost is linear but serial. Other differences (gates, positional encoding) are reinforcement for these two axes.\nIf you place models on a 2D chart:\nX-axis = path length (shorter to the left) Y-axis = resource complexity (lower down) RNN/LSTM sit lower but to the right, Transformer sits higher but to the left, and CNN position varies with kernel size and depth. The two axes are not independent: reduce n and you lower cost but increase each token\u0026rsquo;s semantic coverage; increase layers and you shorten path but raise compute and training difficulty. Real-world solutions often mix compression, local modules, and a small amount of global attention.\nEngineering scenarios (only 3, tied to two axes) Short text classification (n \u0026lt;= 128): small dependency span -\u0026gt; LSTM or 1D CNN is usually enough. With n \u0026lt;= 128, attention matrix is only ~16k elements, so Transformer advantage is limited. 
Long document summarization (n \u0026gt;= 1024): large dependency span -\u0026gt; Transformer, but you must manage n^2 cost. At n = 2048, attention weights are 4.2M elements, often requiring FlashAttention or block strategies. Streaming speech recognition: low latency requirement -\u0026gt; CNN + LSTM hybrid is more stable. Serial steps hurt real-time latency, so local CNN compresses first and LSTM preserves mid-range dependencies. Alternatives and tradeoffs (only around the two axes) Full attention vs local attention: Full attention is O(n^2), local window attention is O(n w). If n = 2048, w = 256, cost is ~8x lower, but path length grows to about n/w ~= 8. If dependency span d = 2048 and window w = 256, you need at least L \u0026gt;= ceil(d / w) = 8 layers for global reach. You trade lower memory for deeper paths and harder training. Deeper CNN vs adding attention: With k = 3, CNN needs 256 layers to cover d = 512. Attention can reduce path length to 1 but adds n^2 memory cost. RNN/LSTM vs Transformer: The former has linear resources but long path; the latter has short path but quadratic resources. When n is small and d is not large, RNN/LSTM can have better cost-performance. Increase kernel size vs increase depth: Larger k shortens required depth but compute grows as O(k d^2). More layers keeps small kernels but still increases path length and training difficulty. Skill ladder Master local structures: understand CNN receptive field and path length. Master chain propagation: understand RNN/LSTM state transfer and gradient decay. Master global routing: understand Transformer global interaction and n^2 cost. Extend in practice: for very long n or tight budget, try local attention or hybrid models. Common questions and notes Underestimating n^2 memory leads to failed training. Missing positional encoding makes Transformer unable to represent order. Large LSTM hidden size overfits on small data. 
Shallow CNNs \u0026ldquo;cannot see\u0026rdquo; global dependencies, so performance stalls. Truncated BPTT too short caps dependencies at 128/256 and hurts long-range tasks. Doubling n without more data raises overfitting risk and memory cost sharply. Looking only at parameter count often underestimates real memory use. Using average n for budgeting is risky: if P90 doubles, attention memory is 4x. Excessive padding wastes attention on useless tokens; use length bucketing. Best practices and recommendations Treat dependency span as the first decision axis. Treat memory/throughput budget as the second axis. Validate learnability with a light baseline, then upgrade. If span is large but budget is small, prefer sparse/block attention or hybrids. For long sequences, compute n percentiles before selecting full attention. Tune n and h first; they have the largest effect on memory and throughput. Track peak memory during training, not just parameter count. For ultra-long sequences, try chunking/downsampling and check accuracy. Log P90 n, peak memory, and throughput in training runs. Summary / Conclusion CNN suits local patterns and vision grids. RNN/LSTM suits short to mid sequences and low compute. Transformer excels at long dependencies and parallel training but has n^2 cost. Model selection hinges on two axes: dependency path length and resource complexity. When d \u0026gt; 512, path length often dominates; when n \u0026gt; 2048, memory dominates. If budget is limited, shorten n or use local attention before deepening the model. 
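The local-attention tradeoff from the alternatives section (cost drops from O(n^2) to O(n w), but global reach needs at least ceil(d / w) layers) can be checked numerically. This is a sketch; `window_attention_tradeoff` is an illustrative name and `w` is the local window size:

```python
import math

def window_attention_tradeoff(n, d, w):
    # Cost ratio of full O(n^2) attention vs windowed O(n * w) attention,
    # and the minimum layer count for a window w to span a dependency of d.
    cost_ratio = (n * n) / (n * w)
    min_layers = math.ceil(d / w)
    return cost_ratio, min_layers

ratio, layers = window_attention_tradeoff(n=2048, d=2048, w=256)
print(ratio, layers)  # ~8x cheaper, but at least 8 layers for global reach
```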
References and further reading https://arxiv.org/abs/1409.2329 https://arxiv.org/abs/1706.03762 https://arxiv.org/abs/2010.11929 Call to Action (CTA) Run the same dataset with an LSTM and a Transformer, compare dependency span and memory cost, and write down your conclusion.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/ai/architecture/cnn-rnn-lstm-transformer-comparison/","summary":"Using dependency path length and resource complexity, this article compares CNN, RNN, LSTM, and Transformer, and provides runnable examples plus a selection workflow.","title":"CNN, RNN, LSTM, and Transformer: Differences and When to Use Each"},{"content":" Subtitle / Summary\nTrapping Rain Water is the classic boundary-constraint problem. This ACERS guide explains the two-pointer method, key formulas, and runnable multi-language solutions.\nReading time: 12–15 min Tags: Hot100, two pointers, array SEO keywords: Trapping Rain Water, two pointers, left right max, O(n), Hot100 Meta description: Two-pointer O(n) trapped water solution with engineering scenarios and multi-language code. 
Target Readers Hot100 learners building core templates Engineers handling capacity/volume constraints Anyone who wants a clean O(n) solution Background / Motivation Trapped water is a proxy for “capacity under boundary constraints.”\nIt appears in cache headroom estimation, buffer overflow analysis, and terrain capacity modeling.\nThe naive O(n^2) method is too slow; the two-pointer approach reduces it to O(n).\nCore Concepts Local water level: water[i] = min(maxLeft[i], maxRight[i]) - height[i] Boundary constraints: the lower side limits water Two pointers: maintain left/right maxima in one pass A — Algorithm Problem Restatement Given an array of non-negative integers representing bar heights (each width 1), compute how much water can be trapped after raining.\nInput / Output Name Type Description height int[] bar heights return int total trapped water Example 1 (official) height = [0,1,0,2,1,0,1,3,2,1,2,1] output = 6 Example 2 (official) height = [4,2,0,3,2,5] output = 9 C — Concepts Key Formula For each index i:\nwater[i] = min(maxLeft[i], maxRight[i]) - height[i] Method Type Two pointers Left/right maxima boundary Intuition The lower of the two boundaries determines the water level.\nIf leftMax \u0026lt;= rightMax, the left side is settled and can be computed safely.\nPractical Steps Initialize l=0, r=n-1, leftMax, rightMax Update leftMax and rightMax each step If leftMax \u0026lt;= rightMax, accumulate leftMax - height[l] and move l Otherwise accumulate rightMax - height[r] and move r Runnable Python example (save as trapping_rain_water.py):\ndef trap(height): if not height: return 0 l, r = 0, len(height) - 1 left_max = right_max = 0 ans = 0 while l \u0026lt; r: left_max = max(left_max, height[l]) right_max = max(right_max, height[r]) if left_max \u0026lt;= right_max: ans += left_max - height[l] l += 1 else: ans += right_max - height[r] r -= 1 return ans if __name__ == \u0026#34;__main__\u0026#34;: print(trap([0,1,0,2,1,0,1,3,2,1,2,1])) 
print(trap([4,2,0,3,2,5])) E — Engineering Scenario 1: Cache headroom estimation (Python) Background: treat usage as heights and compute “empty capacity” between peaks.\nWhy it fits: identical boundary-constrained volume calculation.\ndef free_capacity(usage): return trap(usage) print(free_capacity([2, 0, 2])) Scenario 2: Terrain cross-section volume (C++) Background: approximate water volume on a 1D elevation slice.\nWhy it fits: left/right maxima are the limiting walls.\n#include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; int trap(const std::vector\u0026lt;int\u0026gt;\u0026amp; h) { if (h.empty()) return 0; int l = 0, r = (int)h.size() - 1; int leftMax = 0, rightMax = 0, ans = 0; while (l \u0026lt; r) { leftMax = std::max(leftMax, h[l]); rightMax = std::max(rightMax, h[r]); if (leftMax \u0026lt;= rightMax) { ans += leftMax - h[l]; ++l; } else { ans += rightMax - h[r]; --r; } } return ans; } int main() { std::cout \u0026lt;\u0026lt; trap({0,1,0,2,1,0,1,3,2,1,2,1}) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } Scenario 3: Backend buffer overflow risk (Go) Background: estimate how much extra load can fit between high-water marks.\nWhy it fits: two-pointer O(n) is fast enough for online checks.\npackage main import \u0026#34;fmt\u0026#34; func trap(height []int) int { if len(height) == 0 { return 0 } l, r := 0, len(height)-1 leftMax, rightMax := 0, 0 ans := 0 for l \u0026lt; r { if height[l] \u0026gt; leftMax { leftMax = height[l] } if height[r] \u0026gt; rightMax { rightMax = height[r] } if leftMax \u0026lt;= rightMax { ans += leftMax - height[l] l++ } else { ans += rightMax - height[r] r-- } } return ans } func main() { fmt.Println(trap([]int{0,1,0,2,1,0,1,3,2,1,2,1})) } R — Reflection Complexity Time: O(n) Space: O(1) Alternatives Method Idea Complexity Drawbacks Brute force scan left/right for each index O(n^2) too slow Precompute arrays store maxLeft/maxRight O(n) extra memory Monotonic stack compute basins O(n) more complex Two 
pointers online maxima O(n) simplest Why This Is Best in Practice No extra arrays Linear scan, easy to reason about Great for streaming or large datasets Explanation \u0026amp; Rationale Water is bounded by the lower of the two sides.\nBy always processing the side with the smaller boundary, we ensure the water level is fixed.\nThis allows a single pass without missing any contribution.\nFAQs / Pitfalls Why compare leftMax \u0026lt;= rightMax?\nThe smaller boundary determines the water level on that side.\nDo zeros break anything?\nNo, zeros are just low bars.\nAre negative heights allowed?\nThe problem restricts to non-negative heights.\nBest Practices Use the two-pointer variant for O(1) space Use the precompute variant if you want clearer intermediate arrays Make sure indices don’t cross (l \u0026lt; r) S — Summary Key Takeaways Trapped water depends on left/right maxima Two pointers compute it in one pass O(n) time and O(1) space Applicable to capacity and boundary-constrained volume problems Hot100 essential template Conclusion The two-pointer solution is both elegant and production-friendly.\nMastering it gives you a reusable pattern for boundary-constrained volume problems.\nReferences \u0026amp; Further Reading LeetCode 42. Trapping Rain Water Monotonic stack techniques Boundary constraint modeling Meta Reading time: 12–15 min Tags: Hot100, two pointers, array, prefix max SEO keywords: Trapping Rain Water, two pointers, O(n), Hot100 Meta description: Two-pointer O(n) trapped water with engineering scenarios and code. 
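As an illustration of the \u0026ldquo;precompute arrays\u0026rdquo; alternative from the Reflection table (same O(n) time, O(n) extra memory, but with the intermediate maxima visible), a minimal sketch; `trap_precompute` is an illustrative name:

```python
def trap_precompute(height):
    # O(n) time, O(n) space variant: store running maxima explicitly.
    n = len(height)
    if n == 0:
        return 0
    max_left = [0] * n
    max_right = [0] * n
    max_left[0] = height[0]
    for i in range(1, n):
        max_left[i] = max(max_left[i - 1], height[i])
    max_right[n - 1] = height[n - 1]
    for i in range(n - 2, -1, -1):
        max_right[i] = max(max_right[i + 1], height[i])
    # water[i] = min(maxLeft[i], maxRight[i]) - height[i]
    return sum(min(l, r) - h for l, r, h in zip(max_left, max_right, height))

print(trap_precompute([0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1]))  # 6
print(trap_precompute([4, 2, 0, 3, 2, 5]))                    # 9
```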
Call to Action If you are working through Hot100, turn this into a template for boundary-constrained problems.\nShare your real-world adaptations in the comments.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) def trap(height): if not height: return 0 l, r = 0, len(height) - 1 left_max = right_max = 0 ans = 0 while l \u0026lt; r: left_max = max(left_max, height[l]) right_max = max(right_max, height[r]) if left_max \u0026lt;= right_max: ans += left_max - height[l] l += 1 else: ans += right_max - height[r] r -= 1 return ans if __name__ == \u0026#34;__main__\u0026#34;: print(trap([0,1,0,2,1,0,1,3,2,1,2,1])) #include \u0026lt;stdio.h\u0026gt; int trap(const int *h, int n) { if (n == 0) return 0; int l = 0, r = n - 1; int leftMax = 0, rightMax = 0, ans = 0; while (l \u0026lt; r) { if (h[l] \u0026gt; leftMax) leftMax = h[l]; if (h[r] \u0026gt; rightMax) rightMax = h[r]; if (leftMax \u0026lt;= rightMax) { ans += leftMax - h[l]; ++l; } else { ans += rightMax - h[r]; --r; } } return ans; } int main(void) { int h[] = {0,1,0,2,1,0,1,3,2,1,2,1}; printf(\u0026#34;%d\\n\u0026#34;, trap(h, 12)); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; int trap(const std::vector\u0026lt;int\u0026gt;\u0026amp; h) { if (h.empty()) return 0; int l = 0, r = (int)h.size() - 1; int leftMax = 0, rightMax = 0, ans = 0; while (l \u0026lt; r) { leftMax = std::max(leftMax, h[l]); rightMax = std::max(rightMax, h[r]); if (leftMax \u0026lt;= rightMax) { ans += leftMax - h[l]; ++l; } else { ans += rightMax - h[r]; --r; } } return ans; } int main() { std::cout \u0026lt;\u0026lt; trap({0,1,0,2,1,0,1,3,2,1,2,1}) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func trap(height []int) int { if len(height) == 0 { return 0 } l, r := 0, len(height)-1 leftMax, rightMax := 0, 0 ans := 0 for l \u0026lt; r { if height[l] \u0026gt; leftMax { leftMax = height[l] } if height[r] \u0026gt; rightMax { rightMax = 
height[r] } if leftMax \u0026lt;= rightMax { ans += leftMax - height[l] l++ } else { ans += rightMax - height[r] r-- } } return ans } func main() { fmt.Println(trap([]int{0,1,0,2,1,0,1,3,2,1,2,1})) } fn trap(height: \u0026amp;[i32]) -\u0026gt; i32 { if height.is_empty() { return 0; } let mut l: i32 = 0; let mut r: i32 = height.len() as i32 - 1; let mut left_max = 0; let mut right_max = 0; let mut ans = 0; while l \u0026lt; r { let li = l as usize; let ri = r as usize; if height[li] \u0026gt; left_max { left_max = height[li]; } if height[ri] \u0026gt; right_max { right_max = height[ri]; } if left_max \u0026lt;= right_max { ans += left_max - height[li]; l += 1; } else { ans += right_max - height[ri]; r -= 1; } } ans } fn main() { let h = vec![0,1,0,2,1,0,1,3,2,1,2,1]; println!(\u0026#34;{}\u0026#34;, trap(\u0026amp;h)); } function trap(height) { if (height.length === 0) return 0; let l = 0; let r = height.length - 1; let leftMax = 0; let rightMax = 0; let ans = 0; while (l \u0026lt; r) { leftMax = Math.max(leftMax, height[l]); rightMax = Math.max(rightMax, height[r]); if (leftMax \u0026lt;= rightMax) { ans += leftMax - height[l]; l++; } else { ans += rightMax - height[r]; r--; } } return ans; } console.log(trap([0,1,0,2,1,0,1,3,2,1,2,1])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/42-trapping-rain-water/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nTrapping Rain Water is the classic boundary-constraint problem. 
This ACERS guide explains the two-pointer method, key formulas, and runnable multi-language solutions.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 12–15 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003etwo pointers\u003c/code\u003e, \u003ccode\u003earray\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Trapping Rain Water, two pointers, left right max, O(n), Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Two-pointer O(n) trapped water solution with engineering scenarios and multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners building core templates\u003c/li\u003e\n\u003cli\u003eEngineers handling capacity/volume constraints\u003c/li\u003e\n\u003cli\u003eAnyone who wants a clean O(n) solution\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eTrapped water is a proxy for “capacity under boundary constraints.”\u003cbr\u003e\nIt appears in cache headroom estimation, buffer overflow analysis, and terrain capacity modeling.\u003cbr\u003e\nThe naive O(n^2) method is too slow; the two-pointer approach reduces it to O(n).\u003c/p\u003e","title":"Hot100: Trapping Rain Water (Two Pointers O(n) ACERS Guide)"},{"content":" Subtitle / Summary\nMaximum Subarray is the classic 1D DP / greedy template. 
This ACERS guide explains Kadane\u0026rsquo;s idea, engineering use cases, and runnable multi-language solutions.\nReading time: 10–12 min Tags: Hot100, dynamic programming, greedy SEO keywords: Maximum Subarray, Kadane, dynamic programming, O(n), Hot100 Meta description: Kadane O(n) maximum subarray sum with engineering scenarios and multi-language code. Target Readers Hot100 learners building stable templates Engineers analyzing peak segments in time series Anyone who wants a clean O(n) solution Background / Motivation Maximum subarray sum appears in P\u0026amp;L streaks, KPI lift windows, anomaly bursts, and throughput gains.\nThe naive O(n^2) enumeration does not scale. Kadane\u0026rsquo;s algorithm solves it in one pass.\nCore Concepts Subarray: contiguous, non-empty segment State: dp[i] = best sum ending at index i Kadane: if the running sum is negative, drop it and restart A — Algorithm Problem Restatement Given an integer array nums, find the contiguous subarray with the largest sum (must contain at least one element) and return the sum.\nInput / Output Name Type Description nums int[] integer array return int maximum subarray sum Example 1 (official) nums = [-2,1,-3,4,-1,2,1,-5,4] output = 6 explanation: subarray [4,-1,2,1] has sum 6 Example 2 (official) nums = [1] output = 1 C — Concepts Key Formula Let dp[i] be the maximum subarray sum ending at i:\ndp[i] = max(nums[i], dp[i-1] + nums[i]) answer = max(dp[i]) Method Type 1D DP Greedy restart when prefix is negative Intuition If the best sum ending at i-1 is negative, extending it only makes the sum worse.\nSo we restart at i.\nPractical Steps Initialize cur = nums[0], best = nums[0] Scan from index 1: cur = max(nums[i], cur + nums[i]) best = max(best, cur) Return best Runnable Python example (save as maximum_subarray.py):\ndef max_subarray(nums): cur = best = nums[0] for x in nums[1:]: cur = max(x, cur + x) best = max(best, cur) return best if __name__ == \u0026#34;__main__\u0026#34;: 
print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4])) print(max_subarray([1])) E — Engineering Scenario 1: Profit streak detection (Python, data analysis) Background: daily profit deltas, find the best contiguous streak.\nWhy it fits: Kadane yields the peak gain window in O(n).\ndef best_profit_streak(deltas): cur = best = deltas[0] for x in deltas[1:]: cur = max(x, cur + x) best = max(best, cur) return best print(best_profit_streak([3, -2, 5, -1, 2, -4, 6])) Scenario 2: High-performance metric bursts (C++) Background: find the strongest contiguous spike in CPU delta metrics.\nWhy it fits: O(n) scan is cache-friendly and fast.\n#include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; int maxBurst(const std::vector\u0026lt;int\u0026gt;\u0026amp; deltas) { int cur = deltas[0]; int best = deltas[0]; for (size_t i = 1; i \u0026lt; deltas.size(); ++i) { cur = std::max(deltas[i], cur + deltas[i]); best = std::max(best, cur); } return best; } int main() { std::cout \u0026lt;\u0026lt; maxBurst({3, -2, 5, -1, 2}) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } Scenario 3: Backend throughput improvement window (Go) Background: QPS deltas over time, find the best continuous improvement.\nWhy it fits: Kadane works in a streaming update loop.\npackage main import \u0026#34;fmt\u0026#34; func maxIncrease(deltas []int) int { cur := deltas[0] best := deltas[0] for i := 1; i \u0026lt; len(deltas); i++ { if cur+deltas[i] \u0026gt; deltas[i] { cur += deltas[i] } else { cur = deltas[i] } if cur \u0026gt; best { best = cur } } return best } func main() { fmt.Println(maxIncrease([]int{3, -2, 5, -1, 2})) } Scenario 4: Frontend engagement lift (JavaScript) Background: analyze consecutive engagement deltas to find best campaign window.\nWhy it fits: linear scan in browser or Node.js.\nfunction maxSubArray(nums) { let cur = nums[0]; let best = nums[0]; for (let i = 1; i \u0026lt; nums.length; i++) { cur = Math.max(nums[i], cur + nums[i]); best = Math.max(best, cur); } 
return best; } console.log(maxSubArray([3, -2, 5, -1, 2])); R — Reflection Complexity Time: O(n) Space: O(1) Alternatives Method Idea Complexity Drawbacks Brute force enumerate all subarrays O(n^2) too slow Prefix sums compute each interval sum O(n^2) still slow Divide \u0026amp; conquer split and merge O(n log n) more complex Kadane 1D DP O(n) simplest and optimal Why This Is Optimal One pass, constant memory Clear correctness argument Easy to embed in streaming pipelines Explanation \u0026amp; Rationale Kadane\u0026rsquo;s algorithm keeps the best sum ending at the current index.\nWhen the running sum turns negative, it cannot help any future subarray, so we restart.\nThis gives the optimal sum with linear complexity.\nFAQs / Pitfalls What if all numbers are negative?\nStill works: the answer is the least negative single element.\nAre empty subarrays allowed?\nNo, the problem requires at least one element.\nDo we need the interval indices?\nIf needed, track start/end when updating cur.\nBest Practices Use two variables (cur, best) instead of full DP arrays Convert complex signals into delta arrays before applying Kadane If you need indices, store a candidate start pointer S — Summary Key Takeaways Maximum Subarray is a classic 1D DP template Kadane drops negative prefixes for optimality O(n) time and O(1) space scale well Works for all-negative arrays without special cases Common in profit, throughput, and metric burst analysis Conclusion Kadane is a must-know pattern for contiguous optimum problems.\nOnce mastered, you can apply it to many real-world sequences.\nReferences \u0026amp; Further Reading LeetCode 53. Maximum Subarray Standard DP textbooks (maximum subarray sum) CLRS discussion of divide-and-conquer vs Kadane Meta Reading time: 10–12 min Tags: Hot100, dynamic programming, greedy, subarray SEO keywords: Maximum Subarray, Kadane, O(n), Hot100 Meta description: Kadane O(n) maximum subarray sum with engineering scenarios. 
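The FAQ above notes that interval indices can be recovered by tracking a candidate start pointer. A minimal sketch of that variant; `max_subarray_with_indices` is an illustrative name:

```python
def max_subarray_with_indices(nums):
    # Kadane variant that also reports the [start, end] window (inclusive).
    cur = best = nums[0]
    cand_start = 0          # start of the best subarray ending at i
    start = end = 0         # best window found so far
    for i in range(1, len(nums)):
        if cur + nums[i] < nums[i]:   # negative prefix: restart at i
            cur = nums[i]
            cand_start = i
        else:
            cur += nums[i]
        if cur > best:
            best = cur
            start, end = cand_start, i
    return best, start, end

print(max_subarray_with_indices([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # (6, 3, 6)
```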
Call to Action If you are working through Hot100, turn Kadane into a reusable template in your toolbox.\nShare your engineering adaptations in the comments.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) def max_subarray(nums): cur = best = nums[0] for x in nums[1:]: cur = max(x, cur + x) best = max(best, cur) return best if __name__ == \u0026#34;__main__\u0026#34;: print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4])) #include \u0026lt;stdio.h\u0026gt; int max_subarray(const int *nums, int n) { int cur = nums[0]; int best = nums[0]; for (int i = 1; i \u0026lt; n; ++i) { int with_cur = cur + nums[i]; cur = nums[i] \u0026gt; with_cur ? nums[i] : with_cur; if (cur \u0026gt; best) best = cur; } return best; } int main(void) { int nums[] = {-2, 1, -3, 4, -1, 2, 1, -5, 4}; printf(\u0026#34;%d\\n\u0026#34;, max_subarray(nums, 9)); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; int maxSubArray(const std::vector\u0026lt;int\u0026gt;\u0026amp; nums) { int cur = nums[0]; int best = nums[0]; for (size_t i = 1; i \u0026lt; nums.size(); ++i) { cur = std::max(nums[i], cur + nums[i]); best = std::max(best, cur); } return best; } int main() { std::vector\u0026lt;int\u0026gt; nums = {-2, 1, -3, 4, -1, 2, 1, -5, 4}; std::cout \u0026lt;\u0026lt; maxSubArray(nums) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func maxSubArray(nums []int) int { cur := nums[0] best := nums[0] for i := 1; i \u0026lt; len(nums); i++ { if cur+nums[i] \u0026gt; nums[i] { cur += nums[i] } else { cur = nums[i] } if cur \u0026gt; best { best = cur } } return best } func main() { nums := []int{-2, 1, -3, 4, -1, 2, 1, -5, 4} fmt.Println(maxSubArray(nums)) } fn max_subarray(nums: \u0026amp;[i32]) -\u0026gt; i32 { let mut cur = nums[0]; let mut best = nums[0]; for \u0026amp;x in \u0026amp;nums[1..] 
{ let with_cur = cur + x; cur = if x \u0026gt; with_cur { x } else { with_cur }; if cur \u0026gt; best { best = cur; } } best } fn main() { let nums = vec![-2, 1, -3, 4, -1, 2, 1, -5, 4]; println!(\u0026#34;{}\u0026#34;, max_subarray(\u0026amp;nums)); } function maxSubArray(nums) { let cur = nums[0]; let best = nums[0]; for (let i = 1; i \u0026lt; nums.length; i++) { cur = Math.max(nums[i], cur + nums[i]); best = Math.max(best, cur); } return best; } console.log(maxSubArray([-2, 1, -3, 4, -1, 2, 1, -5, 4])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/hot100/53-maximum-subarray/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nMaximum Subarray is the classic 1D DP / greedy template. This ACERS guide explains Kadane\u0026rsquo;s idea, engineering use cases, and runnable multi-language solutions.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10–12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003eHot100\u003c/code\u003e, \u003ccode\u003edynamic programming\u003c/code\u003e, \u003ccode\u003egreedy\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Maximum Subarray, Kadane, dynamic programming, O(n), Hot100\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Kadane O(n) maximum subarray sum with engineering scenarios and multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHot100 learners building stable templates\u003c/li\u003e\n\u003cli\u003eEngineers analyzing peak segments in time series\u003c/li\u003e\n\u003cli\u003eAnyone who wants a clean O(n) solution\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / 
Motivation\u003c/h2\u003e\n\u003cp\u003eMaximum subarray sum appears in P\u0026amp;L streaks, KPI lift windows, anomaly bursts, and throughput gains.\u003cbr\u003e\nThe naive O(n^2) enumeration does not scale. Kadane\u0026rsquo;s algorithm solves it in one pass.\u003c/p\u003e","title":"Hot100: Maximum Subarray (Kadane O(n) ACERS Guide)"},{"content":" Subtitle / Summary\nA classic event-spacing validation model. This ACERS guide explains the one-pass logic, engineering use cases, and runnable multi-language solutions.\nReading time: 10–12 min Tags: array, two pointers, event spacing SEO keywords: LeetCode 1437, event spacing, O(n) Meta description: One-pass validation for minimum spacing between 1s, with engineering use cases and multi-language code. Target Readers LeetCode learners building stable templates Engineers working on monitoring / risk control / behavior analytics Developers who need spacing or rate-limit validations Background / Motivation Many systems require events to be spaced apart: login failures, alarms, sensitive actions, API calls, etc.\nThis problem maps directly to event spacing validation.\nA one-pass, O(1)-memory solution is ideal for real-time systems.\nCore Concepts Event spacing: at least k zeros between two 1s Online validation: only the last event index is needed Boundary handling: initialize last = -k-1 to avoid special cases A — Algorithm Problem Restatement Given an integer array nums and integer k, return true if every pair of 1s is at least k apart; otherwise return false.\nInput / Output Name Type Description nums int[] binary array with 0/1 k int required minimum spacing return bool whether the spacing rule holds Example 1 nums = [1,0,0,0,1,0,0,1], k = 2 output = true Example 2 nums = [1,0,1], k = 2 output = false C — Concepts Key Observation Track the index of the last seen 1 (last) On each new 1, if i - last \u0026lt;= k → spacing violated Method Type One-pass scan Event spacing validation Greedy with a last pointer Formula 
Spacing requirement:\n(j - i - 1) \u0026gt;= k ⇔ (j - i) \u0026gt; k So the violation check is:\nif i - last \u0026lt;= k: return false Practical Steps Set last = -k - 1 (so the first 1 always passes) Scan from left to right When seeing 1, check distance to last If too close, return false Otherwise update last and continue Runnable Python example (save as k_length_apart.py):\ndef k_length_apart(nums, k): last = -k - 1 for i, x in enumerate(nums): if x == 1: if i - last \u0026lt;= k: return False last = i return True if __name__ == \u0026#34;__main__\u0026#34;: print(k_length_apart([1, 0, 0, 0, 1, 0, 0, 1], 2)) # True print(k_length_apart([1, 0, 1], 2)) # False E — Engineering Scenario 1: Risk control for login failures (Python) Background: repeated login failures too close together suggest brute force.\nWhy it fits: only the last failure index is required.\ndef check_login_spacing(events, k): last = -k - 1 for i, x in enumerate(events): if x != 1: continue if i - last \u0026lt;= k: return False last = i return True Scenario 2: Monitoring error density (Go) Background: errors shouldn’t occur too frequently in a time window.\nWhy it fits: O(1) memory, stream-friendly.\npackage main import \u0026#34;fmt\u0026#34; func okSpacing(log []int, k int) bool { last := -k - 1 for i, x := range log { if x == 1 { if i-last \u0026lt;= k { return false } last = i } } return true } func main() { fmt.Println(okSpacing([]int{1, 0, 0, 1}, 2)) } Scenario 3: Debounce in embedded systems (C) Background: sensor triggers must be spaced to avoid bouncing.\nWhy it fits: minimal state, fast checks.\n#include \u0026lt;stdio.h\u0026gt; int k_length_apart(const int *a, int n, int k) { int last = -k - 1; for (int i = 0; i \u0026lt; n; ++i) { if (a[i] == 1) { if (i - last \u0026lt;= k) return 0; last = i; } } return 1; } int main(void) { int a[] = {1,0,0,1}; printf(\u0026#34;%d\\n\u0026#34;, k_length_apart(a, 4, 2)); return 0; } Scenario 4: Frontend click throttling (JavaScript) Background: avoid 
bursts of high-value actions.\nWhy it fits: same spacing model on a click sequence.\nfunction okSpacing(events, k) { let last = -k - 1; for (let i = 0; i \u0026lt; events.length; i++) { if (events[i] === 1) { if (i - last \u0026lt;= k) return false; last = i; } } return true; } console.log(okSpacing([1, 0, 0, 1], 2)); R — Reflection Complexity Time: O(n) Space: O(1) Alternatives Method Idea Complexity Drawbacks Store all 1 indices then validate gaps O(n) extra memory Double loop compare all pairs O(n^2) too slow One-pass keep last index O(n) simplest Why This Is Best Minimal state Works in streaming systems Straightforward correctness Explanation \u0026amp; Rationale Keeping only the last 1 is enough because the constraint is local to consecutive 1s.\nInitializing last = -k-1 creates a “virtual 1” so the first real 1 always passes.\nAny time i - last \u0026lt;= k, the rule is violated.\nFAQs / Pitfalls Why i - last \u0026lt;= k?\nThe requirement (i - last - 1) \u0026gt;= k rearranges to i - last \u0026gt; k.\nIs k = 0 valid?\nYes. It means adjacent 1s are allowed.\nIs the array required to be 0/1?\nThe problem is binary, but the model can be generalized if you define “event” as 1.\nBest Practices Use last = -k-1 to avoid special cases Wrap the logic as a reusable spacing validator Combine with rate-limit checks if needed S — Summary Key Takeaways The task is an event-spacing validation problem Only the last event index is needed Initialization trick simplifies boundary handling One-pass scan gives O(n)/O(1) Useful in risk control, monitoring, and throttling Conclusion This is a simple but powerful template that maps directly to production systems.\nTurn it into a reusable utility and you’ll use it again and again.\nReferences \u0026amp; Further Reading LeetCode 1437. 
Check If All 1\u0026rsquo;s Are at Least Length K Places Away Rate limiting / debounce / throttle docs Event stream processing basics Meta Reading time: 10–12 min Tags: array, event spacing, monitoring, risk control SEO keywords: LeetCode 1437, event spacing, O(n) Meta description: One-pass minimum spacing validation with engineering use cases. Call to Action If you’re building monitoring or risk-control systems, add this “event spacing” template to your toolkit.\nShare your real-world adaptations in the comments.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) def k_length_apart(nums, k): last = -k - 1 for i, x in enumerate(nums): if x == 1: if i - last \u0026lt;= k: return False last = i return True if __name__ == \u0026#34;__main__\u0026#34;: print(k_length_apart([1, 0, 0, 0, 1, 0, 0, 1], 2)) #include \u0026lt;stdio.h\u0026gt; int k_length_apart(const int *a, int n, int k) { int last = -k - 1; for (int i = 0; i \u0026lt; n; ++i) { if (a[i] == 1) { if (i - last \u0026lt;= k) return 0; last = i; } } return 1; } int main(void) { int a[] = {1,0,0,1}; printf(\u0026#34;%d\\n\u0026#34;, k_length_apart(a, 4, 2)); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; bool kLengthApart(const std::vector\u0026lt;int\u0026gt;\u0026amp; nums, int k) { int last = -k - 1; for (int i = 0; i \u0026lt; (int)nums.size(); ++i) { if (nums[i] == 1) { if (i - last \u0026lt;= k) return false; last = i; } } return true; } int main() { std::cout \u0026lt;\u0026lt; std::boolalpha \u0026lt;\u0026lt; kLengthApart({1,0,0,1}, 2) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func kLengthApart(nums []int, k int) bool { last := -k - 1 for i, x := range nums { if x == 1 { if i-last \u0026lt;= k { return false } last = i } } return true } func main() { fmt.Println(kLengthApart([]int{1, 0, 0, 1}, 2)) } fn k_length_apart(nums: \u0026amp;[i32], k: i32) -\u0026gt; bool { let mut last = -k - 1; for (i, 
\u0026amp;x) in nums.iter().enumerate() { if x == 1 { let i = i as i32; if i - last \u0026lt;= k { return false; } last = i; } } true } fn main() { let nums = vec![1, 0, 0, 1]; println!(\u0026#34;{}\u0026#34;, k_length_apart(\u0026amp;nums, 2)); } function kLengthApart(nums, k) { let last = -k - 1; for (let i = 0; i \u0026lt; nums.length; i++) { if (nums[i] === 1) { if (i - last \u0026lt;= k) return false; last = i; } } return true; } console.log(kLengthApart([1, 0, 0, 1], 2)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/1437-check-if-all-ones-are-at-least-k-places-away/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nA classic event-spacing validation model. This ACERS guide explains the one-pass logic, engineering use cases, and runnable multi-language solutions.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10–12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003earray\u003c/code\u003e, \u003ccode\u003etwo pointers\u003c/code\u003e, \u003ccode\u003eevent spacing\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: LeetCode 1437, event spacing, O(n)\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: One-pass validation for minimum spacing between 1s, with engineering use cases and multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLeetCode learners building stable templates\u003c/li\u003e\n\u003cli\u003eEngineers working on monitoring / risk control / behavior analytics\u003c/li\u003e\n\u003cli\u003eDevelopers who need spacing or rate-limit validations\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / 
Motivation\u003c/h2\u003e\n\u003cp\u003eMany systems require events to be spaced apart: login failures, alarms, sensitive actions, API calls, etc.\u003cbr\u003e\nThis problem maps directly to \u003cstrong\u003eevent spacing validation\u003c/strong\u003e.\u003cbr\u003e\nA one-pass, O(1)-memory solution is ideal for real-time systems.\u003c/p\u003e","title":"LeetCode 1437: Check If All 1's Are at Least K Apart (ACERS Guide)"},{"content":" Subtitle / Summary\nA classic bit-manipulation template: determine if a number is a power of two in O(1). This ACERS guide covers the core insight, practical uses, and runnable multi-language implementations.\nReading time: 8–12 min Tags: bit manipulation, binary, math SEO keywords: Power of Two, bit manipulation, binary, O(1), LeetCode 231 Meta description: O(1) power-of-two check using bit tricks, with engineering scenarios and multi-language code. Target Readers LeetCode learners building a bit-manipulation toolkit Backend / systems engineers who need alignment or capacity checks Anyone who wants stable O(1) integer tests Background / Motivation Power-of-two checks show up everywhere: hash table capacities, memory alignment, sharding, FFT window sizes.\nLooping or using floating-point logs is slower and prone to corner-case bugs.\nThe bitwise method is fast, simple, and reliable.\nCore Concepts Binary form: a power of two has exactly one 1 in its binary representation Bitwise AND: n \u0026amp; (n - 1) clears the lowest set bit Positive-only: n must be greater than 0 A — Algorithm Problem Restatement Given an integer n, determine whether it is a power of two.\nReturn true if it is; otherwise, return false.\nInput / Output Name Type Description n int input integer return bool whether n is a power of two Example 1 Input: n = 1 Output: true Explanation: 2^0 = 1 Example 2 Input: n = 12 Output: false Explanation: 12 in binary is 1100, which has multiple 1s C — Concepts Core Insight A power of two has a single 1 bit:\n1 = 0001 2 = 0010 
4 = 0100 8 = 1000 If n has exactly one 1, then:\nn = 1000...000 n - 1 = 0111...111 n \u0026amp; (n - 1) = 0 Therefore:\nn is power of two ⟺ n \u0026gt; 0 and (n \u0026amp; (n - 1)) == 0 Method Type Bit manipulation (bit hacks) Constant-time numeric test Practical Steps Reject non-positive values: n \u0026lt;= 0 → false Compute (n \u0026amp; (n - 1)) If the result is 0, return true Runnable Python example (save as power_of_two.py and run python3 power_of_two.py):\ndef is_power_of_two(n: int) -\u0026gt; bool: return n \u0026gt; 0 and (n \u0026amp; (n - 1)) == 0 if __name__ == \u0026#34;__main__\u0026#34;: print(is_power_of_two(1)) # True print(is_power_of_two(12)) # False E — Engineering Scenario 1: Data analysis / signal processing window size (Python) Background: FFT and some convolution routines require power-of-two sizes.\nWhy it fits: one-line validation avoids runtime failures.\ndef is_power_of_two(n: int) -\u0026gt; bool: return n \u0026gt; 0 and (n \u0026amp; (n - 1)) == 0 window = 1024 if not is_power_of_two(window): raise ValueError(\u0026#34;window must be power of two\u0026#34;) print(\u0026#34;ok\u0026#34;) Scenario 2: Memory alignment in allocators (C) Background: memory allocators often align blocks to powers of two.\nWhy it fits: the check is constant-time and branch-light.\n#include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdint.h\u0026gt; int is_pow2(uint32_t x) { return x \u0026gt; 0 \u0026amp;\u0026amp; (x \u0026amp; (x - 1)) == 0; } int main(void) { printf(\u0026#34;%d\\n\u0026#34;, is_pow2(64)); printf(\u0026#34;%d\\n\u0026#34;, is_pow2(48)); return 0; } Scenario 3: Backend sharding / worker count validation (Go) Background: shard counts are often powers of two to enable idx \u0026amp; (n-1) routing.\nWhy it fits: avoids modulo cost and keeps mapping uniform.\npackage main import \u0026#34;fmt\u0026#34; func isPowerOfTwo(n int) bool { return n \u0026gt; 0 \u0026amp;\u0026amp; (n\u0026amp;(n-1)) == 0 } func main() { shards := 16 if 
!isPowerOfTwo(shards) { panic(\u0026#34;shards must be power of two\u0026#34;) } fmt.Println(\u0026#34;ok\u0026#34;) } R — Reflection Complexity Time: O(1) Space: O(1) Alternative Approaches Method Idea Complexity Drawbacks Divide by 2 loop keep dividing while even O(log n) slower, more code Popcount == 1 count 1-bits O(1) library/intrinsic dependency log2 check check integer log varies floating-point precision Bit trick n \u0026amp; (n - 1) O(1) simplest and robust Why This Is Best in Practice No loops, no divisions, no floating point Stable across languages and integer sizes Matches real-world systems requirements Explanation \u0026amp; Rationale A power of two has a single set bit. Subtracting 1 flips that bit to 0 and turns all lower bits to 1.\nSo only a number with one set bit will satisfy n \u0026amp; (n - 1) == 0.\nWe must additionally check n \u0026gt; 0 to rule out 0 and negatives.\nFAQs / Pitfalls Is n = 0 a power of two?\nNo. Always guard with n \u0026gt; 0.\nWhat about negative numbers?\nThey are not powers of two in this context. The sign bit in two\u0026rsquo;s complement breaks the single-bit property.\nIs log2(n) safe?\nNot reliably—floating-point precision can misclassify large values.\nBest Practices Always include the n \u0026gt; 0 check Encapsulate the logic into a small utility function If you need the nearest power of two, build a separate helper instead of overloading this function S — Summary Key Takeaways A power of two has exactly one 1 bit n \u0026amp; (n - 1) removes the lowest set bit n \u0026gt; 0 is required to exclude 0 and negatives The bit trick is O(1), concise, and reliable Widely used in hashing, alignment, and sharding Conclusion This is a core bit-manipulation template worth memorizing. It shows up repeatedly in systems code and performance-critical logic.\nReferences \u0026amp; Further Reading LeetCode 231. Power of Two LeetCode 191. Number of 1 Bits LeetCode 342. 
Power of Four Hacker\u0026rsquo;s Delight (bit tricks) Computer Systems: A Programmer\u0026rsquo;s Perspective (binary operations) Meta Reading time: 8–12 min Tags: bit manipulation, binary, math, LeetCode 231 SEO keywords: Power of Two, bit manipulation, binary, O(1) Meta description: O(1) power-of-two check with bit tricks and engineering applications. Call to Action Try converting a few related problems (power of four, count of 1 bits) into your own ACERS templates.\nShare your variants or engineering use cases in the comments.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) def is_power_of_two(n: int) -\u0026gt; bool: return n \u0026gt; 0 and (n \u0026amp; (n - 1)) == 0 if __name__ == \u0026#34;__main__\u0026#34;: print(is_power_of_two(1)) # True print(is_power_of_two(12)) # False #include \u0026lt;stdio.h\u0026gt; int is_power_of_two(int n) { return n \u0026gt; 0 \u0026amp;\u0026amp; (n \u0026amp; (n - 1)) == 0; } int main(void) { printf(\u0026#34;%d\\n\u0026#34;, is_power_of_two(1)); printf(\u0026#34;%d\\n\u0026#34;, is_power_of_two(12)); return 0; } #include \u0026lt;iostream\u0026gt; bool isPowerOfTwo(int n) { return n \u0026gt; 0 \u0026amp;\u0026amp; (n \u0026amp; (n - 1)) == 0; } int main() { std::cout \u0026lt;\u0026lt; std::boolalpha \u0026lt;\u0026lt; isPowerOfTwo(1) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; std::cout \u0026lt;\u0026lt; std::boolalpha \u0026lt;\u0026lt; isPowerOfTwo(12) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func isPowerOfTwo(n int) bool { return n \u0026gt; 0 \u0026amp;\u0026amp; (n\u0026amp;(n-1)) == 0 } func main() { fmt.Println(isPowerOfTwo(1)) fmt.Println(isPowerOfTwo(12)) } fn is_power_of_two(n: i32) -\u0026gt; bool { n \u0026gt; 0 \u0026amp;\u0026amp; (n \u0026amp; (n - 1)) == 0 } fn main() { println!(\u0026#34;{}\u0026#34;, is_power_of_two(1)); println!(\u0026#34;{}\u0026#34;, is_power_of_two(12)); } function isPowerOfTwo(n) { return n \u0026gt; 
0 \u0026amp;\u0026amp; (n \u0026amp; (n - 1)) === 0; } console.log(isPowerOfTwo(1)); console.log(isPowerOfTwo(12)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/231-power-of-two/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nA classic bit-manipulation template: determine if a number is a power of two in O(1). This ACERS guide covers the core insight, practical uses, and runnable multi-language implementations.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 8–12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003ebit manipulation\u003c/code\u003e, \u003ccode\u003ebinary\u003c/code\u003e, \u003ccode\u003emath\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Power of Two, bit manipulation, binary, O(1), LeetCode 231\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: O(1) power-of-two check using bit tricks, with engineering scenarios and multi-language code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLeetCode learners building a bit-manipulation toolkit\u003c/li\u003e\n\u003cli\u003eBackend / systems engineers who need alignment or capacity checks\u003c/li\u003e\n\u003cli\u003eAnyone who wants stable O(1) integer tests\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003ePower-of-two checks show up everywhere: hash table capacities, memory alignment, sharding, FFT window sizes.\u003cbr\u003e\nLooping or using floating-point logs is slower and prone to corner-case bugs.\u003cbr\u003e\nThe bitwise method is fast, simple, and reliable.\u003c/p\u003e","title":"LeetCode 231: Power of Two (Bit Trick O(1) 
ACERS Guide)"},{"content":" Subtitle / Summary\nA standard fixed-window counting problem. This ACERS guide explains the sliding-window model, engineering use cases, and runnable multi-language solutions.\nReading time: 10–12 min Tags: sliding window, string, fixed window SEO keywords: Maximum Number of Vowels, Sliding Window, Fixed Window Meta description: Fixed-window sliding count for maximum vowels with engineering applications. Target Readers LeetCode learners who want stable templates Engineers working on windowed metrics Anyone building real-time counters Background / Motivation Many engineering tasks ask: “What is the maximum count in any fixed-length window?”\nRecomputing every window is O(nk). Sliding window updates in O(1) per step, giving O(n).\nCore Concepts Fixed sliding window: length k, move right one step each time Incremental update: add incoming item, remove outgoing item Condition counting: count only items matching a predicate A — Algorithm Problem Restatement Given a string s and an integer k, return the maximum number of vowels in any substring of length k.\nInput / Output Name Type Description s string lowercase letters k int window length return int maximum vowels in any length-k window Example 1 s = \u0026#34;abciiidef\u0026#34;, k = 3 output = 3 Example 2 s = \u0026#34;aeiou\u0026#34;, k = 2 output = 2 C — Concepts Method Type Fixed sliding window + predicate counting.\nKey Model Maintain cnt for the current window:\ncnt = cnt + isVowel(s[i]) - isVowel(s[i-k]) Each step updates in O(1).\nPractical Steps Count vowels in the first window Set ans = cnt Slide: add s[i], remove s[i-k] Update ans with max Return ans Runnable Example (Python) def max_vowels(s: str, k: int) -\u0026gt; int: vowels = set(\u0026#34;aeiou\u0026#34;) cnt = sum(1 for c in s[:k] if c in vowels) ans = cnt for i in range(k, len(s)): if s[i] in vowels: cnt += 1 if s[i - k] in vowels: cnt -= 1 if cnt \u0026gt; ans: ans = cnt return ans if __name__ == 
\u0026#34;__main__\u0026#34;: print(max_vowels(\u0026#34;abciiidef\u0026#34;, 3)) Run:\npython3 demo.py Explanation \u0026amp; Trade-offs Because the window size is fixed, each move only changes two characters.\nThis makes O(1) updates possible and avoids O(nk) recomputation.\nE — Engineering Scenario 1: Error Peak per Window (Go) Background: Compute the maximum errors in any k-minute window.\nWhy: Same fixed-window counting model.\npackage main import \u0026#34;fmt\u0026#34; func maxErrors(flags []int, k int) int { cnt, ans := 0, 0 for i, x := range flags { if x == 1 { cnt++ } if i \u0026gt;= k \u0026amp;\u0026amp; flags[i-k] == 1 { cnt-- } if i \u0026gt;= k-1 \u0026amp;\u0026amp; cnt \u0026gt; ans { ans = cnt } } return ans } func main() { fmt.Println(maxErrors([]int{0, 1, 1, 0, 1, 0, 1}, 3)) } Scenario 2: Text Feature Peak (Python) Background: Count maximum keyword occurrences in any k-length window.\nWhy: Predicate can be replaced with any condition.\ndef max_keyword(text, k, keywords): s = list(text) cnt = sum(1 for c in s[:k] if c in keywords) ans = cnt for i in range(k, len(s)): if s[i] in keywords: cnt += 1 if s[i - k] in keywords: cnt -= 1 ans = max(ans, cnt) return ans print(max_keyword(\u0026#34;happyxxsadxxhappy\u0026#34;, 5, set(\u0026#34;hs\u0026#34;))) Scenario 3: Frontend Live Highlight (JavaScript) Background: Highlight max sensitive chars in the latest k input.\nWhy: O(1) update per keystroke.\nfunction maxFlag(chars, k, flagSet) { let cnt = 0; for (let i = 0; i \u0026lt; Math.min(k, chars.length); i += 1) { if (flagSet.has(chars[i])) cnt += 1; } let ans = cnt; for (let i = k; i \u0026lt; chars.length; i += 1) { if (flagSet.has(chars[i])) cnt += 1; if (flagSet.has(chars[i - k])) cnt -= 1; ans = Math.max(ans, cnt); } return ans; } console.log(maxFlag(\u0026#34;abciiidef\u0026#34;, 3, new Set([\u0026#34;a\u0026#34;, \u0026#34;e\u0026#34;, \u0026#34;i\u0026#34;, \u0026#34;o\u0026#34;, \u0026#34;u\u0026#34;]))); R — Reflection Complexity Time: O(n) 
Space: O(1) Alternatives Method Time Space Notes Brute force O(nk) O(1) Too slow for large data Prefix sum O(n) O(n) Extra memory Sliding window O(n) O(1) Best balance Common Pitfalls Updating result before the window is formed Forgetting to remove the outgoing element Inconsistent vowel predicate Why this is optimal You must inspect each character at least once.\nSliding window achieves that lower bound with O(1) updates.\nS — Summary Fixed-window counting is a reusable template Sliding window reduces O(nk) to O(n) The predicate can represent any engineering condition Ideal for streaming or online stats Further Reading LeetCode 1456 Sliding Window Pattern Prefix sum vs window Conclusion This problem is less about vowels and more about a fixed-window counting model.\nOnce memorized, it translates directly to monitoring and analytics.\nReferences https://leetcode.com/problems/maximum-number-of-vowels-in-a-substring-of-given-length/ https://en.wikipedia.org/wiki/Sliding_window_protocol Metadata Reading time: 10–12 min Tags: sliding window, string, fixed window SEO: Maximum Number of Vowels, Sliding Window Meta description: O(n) fixed-window max vowels with engineering use cases. 
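The Alternatives table above lists a prefix-sum variant without showing it. As a rough sketch for comparison (the function name `max_vowels_prefix` is my own, not from the original article), it might look like this; it trades the sliding window's O(1) extra memory for an O(n) prefix array:

```python
def max_vowels_prefix(s: str, k: int) -> int:
    vowels = set("aeiou")
    # prefix[i] = number of vowels in s[:i]
    prefix = [0] * (len(s) + 1)
    for i, c in enumerate(s):
        prefix[i + 1] = prefix[i] + (1 if c in vowels else 0)
    # window [i, i+k) contains prefix[i+k] - prefix[i] vowels
    return max(prefix[i + k] - prefix[i] for i in range(len(s) - k + 1))

print(max_vowels_prefix("abciiidef", 3))  # 3
```

The prefix array also answers arbitrary-range vowel counts, which the sliding window cannot; if you only need fixed-length windows, the window version remains the better fit.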
Call to Action Try rewriting one of your monitoring metrics as a fixed-window count.\nIf you want, share your use case and I can help translate it.\nMulti-language Reference (Python / C / C++ / Go / Rust / JS) def max_vowels(s: str, k: int) -\u0026gt; int: vowels = set(\u0026#34;aeiou\u0026#34;) cnt = sum(1 for c in s[:k] if c in vowels) ans = cnt for i in range(k, len(s)): if s[i] in vowels: cnt += 1 if s[i - k] in vowels: cnt -= 1 ans = max(ans, cnt) return ans if __name__ == \u0026#34;__main__\u0026#34;: print(max_vowels(\u0026#34;abciiidef\u0026#34;, 3)) #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;string.h\u0026gt; static int is_vowel(char c) { return c == \u0026#39;a\u0026#39; || c == \u0026#39;e\u0026#39; || c == \u0026#39;i\u0026#39; || c == \u0026#39;o\u0026#39; || c == \u0026#39;u\u0026#39;; } int max_vowels(const char *s, int k) { int cnt = 0; int ans = 0; int n = (int)strlen(s); for (int i = 0; i \u0026lt; n; ++i) { if (is_vowel(s[i])) cnt++; if (i \u0026gt;= k \u0026amp;\u0026amp; is_vowel(s[i - k])) cnt--; if (i \u0026gt;= k - 1 \u0026amp;\u0026amp; cnt \u0026gt; ans) ans = cnt; } return ans; } int main(void) { printf(\u0026#34;%d\\n\u0026#34;, max_vowels(\u0026#34;abciiidef\u0026#34;, 3)); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;string\u0026gt; static bool isVowel(char c) { return c == \u0026#39;a\u0026#39; || c == \u0026#39;e\u0026#39; || c == \u0026#39;i\u0026#39; || c == \u0026#39;o\u0026#39; || c == \u0026#39;u\u0026#39;; } int maxVowels(const std::string \u0026amp;s, int k) { int cnt = 0, ans = 0; for (int i = 0; i \u0026lt; (int)s.size(); ++i) { if (isVowel(s[i])) cnt++; if (i \u0026gt;= k \u0026amp;\u0026amp; isVowel(s[i - k])) cnt--; if (i \u0026gt;= k - 1 \u0026amp;\u0026amp; cnt \u0026gt; ans) ans = cnt; } return ans; } int main() { std::cout \u0026lt;\u0026lt; maxVowels(\u0026#34;abciiidef\u0026#34;, 3) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func 
isVowel(c byte) bool { return c == \u0026#39;a\u0026#39; || c == \u0026#39;e\u0026#39; || c == \u0026#39;i\u0026#39; || c == \u0026#39;o\u0026#39; || c == \u0026#39;u\u0026#39; } func maxVowels(s string, k int) int { cnt, ans := 0, 0 for i := 0; i \u0026lt; len(s); i++ { if isVowel(s[i]) { cnt++ } if i \u0026gt;= k \u0026amp;\u0026amp; isVowel(s[i-k]) { cnt-- } if i \u0026gt;= k-1 \u0026amp;\u0026amp; cnt \u0026gt; ans { ans = cnt } } return ans } func main() { fmt.Println(maxVowels(\u0026#34;abciiidef\u0026#34;, 3)) } fn is_vowel(c: u8) -\u0026gt; bool { c == b\u0026#39;a\u0026#39; || c == b\u0026#39;e\u0026#39; || c == b\u0026#39;i\u0026#39; || c == b\u0026#39;o\u0026#39; || c == b\u0026#39;u\u0026#39; } fn max_vowels(s: \u0026amp;str, k: usize) -\u0026gt; i32 { let bytes = s.as_bytes(); let mut cnt: i32 = 0; let mut ans: i32 = 0; for i in 0..bytes.len() { if is_vowel(bytes[i]) { cnt += 1; } if i \u0026gt;= k \u0026amp;\u0026amp; is_vowel(bytes[i - k]) { cnt -= 1; } if i + 1 \u0026gt;= k \u0026amp;\u0026amp; cnt \u0026gt; ans { ans = cnt; } } ans } fn main() { println!(\u0026#34;{}\u0026#34;, max_vowels(\u0026#34;abciiidef\u0026#34;, 3)); } function maxVowels(s, k) { const isVowel = (c) =\u0026gt; \u0026#34;aeiou\u0026#34;.includes(c); let cnt = 0; let ans = 0; for (let i = 0; i \u0026lt; s.length; i += 1) { if (isVowel(s[i])) cnt += 1; if (i \u0026gt;= k \u0026amp;\u0026amp; isVowel(s[i - k])) cnt -= 1; if (i \u0026gt;= k - 1 \u0026amp;\u0026amp; cnt \u0026gt; ans) ans = cnt; } return ans; } console.log(maxVowels(\u0026#34;abciiidef\u0026#34;, 3)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/1456-maximum-number-of-vowels-in-a-substring-of-given-length/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003cbr\u003e\nA standard fixed-window counting problem. 
This ACERS guide explains the sliding-window model, engineering use cases, and runnable multi-language solutions.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 10–12 min\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003esliding window\u003c/code\u003e, \u003ccode\u003estring\u003c/code\u003e, \u003ccode\u003efixed window\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Maximum Number of Vowels, Sliding Window, Fixed Window\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Fixed-window sliding count for maximum vowels with engineering applications.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLeetCode learners who want stable templates\u003c/li\u003e\n\u003cli\u003eEngineers working on windowed metrics\u003c/li\u003e\n\u003cli\u003eAnyone building real-time counters\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eMany engineering tasks ask: “What is the maximum count in any fixed-length window?”\u003cbr\u003e\nRecomputing every window is O(nk). Sliding window updates in O(1) per step, giving O(n).\u003c/p\u003e","title":"LeetCode 1456: Maximum Number of Vowels in a Substring of Given Length (ACERS Guide)"},{"content":"Title LeetCode 239: Sliding Window Maximum (Monotonic Queue ACERS Guide)\nSubtitle / Summary Sliding Window Maximum is the classic combo of sliding window + monotonic queue. 
This article follows the ACERS template with reusable engineering patterns and multi-language implementations.\nEstimated reading time: 12–15 minutes Tags: sliding window, monotonic queue, array SEO keywords: Sliding Window Maximum, monotonic queue, deque, O(n) Meta description: Monotonic-queue solution for Sliding Window Maximum with engineering practice and multi-language implementations. Target Readers People practicing LeetCode / Hot100 Mid-level developers who want a reusable “sliding window + monotonic queue” template Engineers working on real-time monitoring, log analytics, or risk control Background / Motivation Rolling-window maximum appears everywhere: latency monitoring, price spikes, temperature alerts, real-time smoothing, and many more. The brute-force approach recomputes max per window in O(nk), which is unacceptable for large n. The monotonic queue reduces it to O(n), making it the most practical engineering choice.\nCore Concepts Sliding window: a fixed-length window of size k Monotonic queue: values are kept in decreasing order; the front is always the max Index maintenance: indices let us evict out-of-window elements A — Algorithm (Problem \u0026amp; Algorithm) Problem Restatement Given an integer array nums and a window size k, a sliding window moves from left to right. Each move shifts the window by one. 
Return the maximum for each window.\nInput / Output Name Type Description nums int[] input array k int window size return int[] max of each window Example 1 nums = [1,3,-1,-3,5,3,6,7], k = 3 output = [3,3,5,5,6,7] Example 2 nums = [1], k = 1 output = [1] C — Concepts (Core Ideas) Method Type Sliding window + monotonic queue.\nKey Invariants Values at indices in the queue are monotonically decreasing The front index always lies within the current window The front element is the window maximum Model Sketch Window moves right: 1) pop front if it is out of window 2) pop from back while value \u0026lt;= new value 3) push new index to back 4) front is the max Practical Steps / Walkthrough Use a deque dq to store indices For each index i: Pop front if dq[0] \u0026lt;= i - k Pop from back while nums[dq[-1]] \u0026lt;= nums[i] Push i If i \u0026gt;= k - 1, record nums[dq[0]] Runnable Example (Python) from collections import deque from typing import List def max_sliding_window(nums: List[int], k: int) -\u0026gt; List[int]: dq = deque() ans = [] for i, x in enumerate(nums): while dq and dq[0] \u0026lt;= i - k: dq.popleft() while dq and nums[dq[-1]] \u0026lt;= x: dq.pop() dq.append(i) if i \u0026gt;= k - 1: ans.append(nums[dq[0]]) return ans if __name__ == \u0026#34;__main__\u0026#34;: print(max_sliding_window([1, 3, -1, -3, 5, 3, 6, 7], 3)) Run:\npython3 demo.py Explanation \u0026amp; Rationale The monotonic queue guarantees:\nEach element enters and leaves the deque at most once. Total operations are O(n). Brute force scans each window in O(k), yielding O(nk). 
For large n and k, the gap is huge.\nE — Engineering (Applications) Scenario 1: Rolling Highest Price (Python, data analytics) Background: Compute the highest price within the last k days.\nWhy it fits: Long price series need O(n) rolling max.\nfrom collections import deque def rolling_max(prices, k): dq = deque() ans = [] for i, x in enumerate(prices): while dq and dq[0] \u0026lt;= i - k: dq.popleft() while dq and prices[dq[-1]] \u0026lt;= x: dq.pop() dq.append(i) if i \u0026gt;= k - 1: ans.append(prices[dq[0]]) return ans print(rolling_max([10, 12, 9, 14, 11, 15], 3)) Scenario 2: Service Latency Monitoring (Go, backend) Background: Track the max latency in the latest k requests for alerting.\nWhy it fits: O(1) amortized updates in streaming mode.\npackage main import \u0026#34;fmt\u0026#34; func rollingMax(nums []int, k int) []int { dq := make([]int, 0) ans := make([]int, 0) for i, x := range nums { if len(dq) \u0026gt; 0 \u0026amp;\u0026amp; dq[0] \u0026lt;= i-k { dq = dq[1:] } for len(dq) \u0026gt; 0 \u0026amp;\u0026amp; nums[dq[len(dq)-1]] \u0026lt;= x { dq = dq[:len(dq)-1] } dq = append(dq, i) if i \u0026gt;= k-1 { ans = append(ans, nums[dq[0]]) } } return ans } func main() { fmt.Println(rollingMax([]int{120, 98, 110, 140, 105}, 2)) } Scenario 3: Frontend Chart Highlighting (JavaScript, frontend) Background: Highlight the max point in each window on a chart.\nWhy it fits: Pure frontend computation, no backend needed.\nfunction rollingMax(nums, k) { const dq = []; const ans = []; for (let i = 0; i \u0026lt; nums.length; i += 1) { if (dq.length \u0026amp;\u0026amp; dq[0] \u0026lt;= i - k) dq.shift(); while (dq.length \u0026amp;\u0026amp; nums[dq[dq.length - 1]] \u0026lt;= nums[i]) dq.pop(); dq.push(i); if (i \u0026gt;= k - 1) ans.push(nums[dq[0]]); } return ans; } console.log(rollingMax([2, 5, 3, 6, 1, 4], 3)); R — Reflection (Deeper Insight) Complexity Time: O(n) Space: O(k) Alternatives \u0026amp; Trade-offs Method Time Space Notes Brute force O(nk) O(1) 
Simple but slow Heap / PQ O(n log k) O(k) Requires cleanup of expired elements Monotonic queue O(n) O(k) Optimal approach Common Pitfalls Storing values instead of indices (can’t evict out-of-window elements) Forgetting to pop smaller elements before pushing the new value Off-by-one errors on window boundary (i \u0026gt;= k - 1) Why This Is Optimal Each element is pushed and popped at most once, so total operations are linear.\nCommon Questions \u0026amp; Notes What if k = 1?\nThe result is the original array.\nWhy store indices instead of values?\nYou need indices to know when elements expire.\nWhat if k \u0026gt; len(nums)?\nLeetCode guarantees valid input; in production add boundary checks.\nBest Practices \u0026amp; Tips Keep a reusable monotonic-queue template Use indices to manage window boundaries For large JS arrays, replace shift() with a head pointer for performance For streaming data, keep the queue as a long-lived structure S — Summary The optimal solution uses a monotonic queue The front always holds the window maximum Each element enters and leaves once → O(n) Widely used for monitoring, rolling stats, and real-time metrics Recommended Reading LeetCode 239 — Sliding Window Maximum Monotonic Queue / Deque templates Rolling Aggregation / Streaming Analytics Conclusion The value of Sliding Window Maximum lies in its reusable template. Once you master the monotonic queue, you unlock a class of rolling-statistics problems.\nReferences https://leetcode.com/problems/sliding-window-maximum/ https://en.cppreference.com/w/cpp/container/deque https://docs.python.org/3/library/collections.html#collections.deque https://doc.rust-lang.org/std/collections/struct.VecDeque.html Meta Reading time: 12–15 minutes Tags: sliding window, monotonic queue, array SEO keywords: Sliding Window Maximum, monotonic queue, deque Meta description: Monotonic-queue solution for Sliding Window Maximum with engineering practice and multi-language implementations. 
CTA If you work on rolling metrics or real-time analytics, keep the monotonic queue as a core template. Share your use cases in the comments.\nMulti-language Implementations (Python / C / C++ / Go / Rust / JS) from collections import deque from typing import List def max_sliding_window(nums: List[int], k: int) -\u0026gt; List[int]: dq = deque() ans = [] for i, x in enumerate(nums): while dq and dq[0] \u0026lt;= i - k: dq.popleft() while dq and nums[dq[-1]] \u0026lt;= x: dq.pop() dq.append(i) if i \u0026gt;= k - 1: ans.append(nums[dq[0]]) return ans if __name__ == \u0026#34;__main__\u0026#34;: print(max_sliding_window([1, 3, -1, -3, 5, 3, 6, 7], 3)) #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; int *max_sliding_window(const int *nums, int n, int k, int *out_len) { if (k \u0026lt;= 0 || n \u0026lt;= 0) { *out_len = 0; return NULL; } int *ans = (int *)malloc(sizeof(int) * (n - k + 1)); int *dq = (int *)malloc(sizeof(int) * n); int head = 0, tail = 0; int idx = 0; for (int i = 0; i \u0026lt; n; ++i) { if (head \u0026lt; tail \u0026amp;\u0026amp; dq[head] \u0026lt;= i - k) head++; while (head \u0026lt; tail \u0026amp;\u0026amp; nums[dq[tail - 1]] \u0026lt;= nums[i]) tail--; dq[tail++] = i; if (i \u0026gt;= k - 1) { ans[idx++] = nums[dq[head]]; } } *out_len = idx; free(dq); return ans; } int main(void) { int nums[] = {1, 3, -1, -3, 5, 3, 6, 7}; int out_len = 0; int *res = max_sliding_window(nums, 8, 3, \u0026amp;out_len); for (int i = 0; i \u0026lt; out_len; ++i) { printf(\u0026#34;%d \u0026#34;, res[i]); } printf(\u0026#34;\\n\u0026#34;); free(res); return 0; } #include \u0026lt;deque\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; std::vector\u0026lt;int\u0026gt; maxSlidingWindow(const std::vector\u0026lt;int\u0026gt; \u0026amp;nums, int k) { std::deque\u0026lt;int\u0026gt; dq; std::vector\u0026lt;int\u0026gt; ans; for (int i = 0; i \u0026lt; (int)nums.size(); ++i) { while (!dq.empty() \u0026amp;\u0026amp; 
dq.front() \u0026lt;= i - k) dq.pop_front(); while (!dq.empty() \u0026amp;\u0026amp; nums[dq.back()] \u0026lt;= nums[i]) dq.pop_back(); dq.push_back(i); if (i \u0026gt;= k - 1) ans.push_back(nums[dq.front()]); } return ans; } int main() { std::vector\u0026lt;int\u0026gt; nums{1, 3, -1, -3, 5, 3, 6, 7}; auto res = maxSlidingWindow(nums, 3); for (int x : res) std::cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; std::cout \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func maxSlidingWindow(nums []int, k int) []int { dq := make([]int, 0) ans := make([]int, 0) for i, x := range nums { if len(dq) \u0026gt; 0 \u0026amp;\u0026amp; dq[0] \u0026lt;= i-k { dq = dq[1:] } for len(dq) \u0026gt; 0 \u0026amp;\u0026amp; nums[dq[len(dq)-1]] \u0026lt;= x { dq = dq[:len(dq)-1] } dq = append(dq, i) if i \u0026gt;= k-1 { ans = append(ans, nums[dq[0]]) } } return ans } func main() { fmt.Println(maxSlidingWindow([]int{1, 3, -1, -3, 5, 3, 6, 7}, 3)) } use std::collections::VecDeque; fn max_sliding_window(nums: \u0026amp;[i32], k: usize) -\u0026gt; Vec\u0026lt;i32\u0026gt; { let mut dq: VecDeque\u0026lt;usize\u0026gt; = VecDeque::new(); let mut ans: Vec\u0026lt;i32\u0026gt; = Vec::new(); for (i, \u0026amp;x) in nums.iter().enumerate() { if let Some(\u0026amp;front) = dq.front() { if front + k \u0026lt;= i { dq.pop_front(); } } while let Some(\u0026amp;back) = dq.back() { if nums[back] \u0026lt;= x { dq.pop_back(); } else { break; } } dq.push_back(i); if i + 1 \u0026gt;= k { ans.push(nums[*dq.front().unwrap()]); } } ans } fn main() { let nums = vec![1, 3, -1, -3, 5, 3, 6, 7]; println!(\u0026#34;{:?}\u0026#34;, max_sliding_window(\u0026amp;nums, 3)); } function maxSlidingWindow(nums, k) { const dq = []; const ans = []; for (let i = 0; i \u0026lt; nums.length; i += 1) { if (dq.length \u0026amp;\u0026amp; dq[0] \u0026lt;= i - k) dq.shift(); while (dq.length \u0026amp;\u0026amp; nums[dq[dq.length - 1]] \u0026lt;= nums[i]) 
dq.pop(); dq.push(i); if (i \u0026gt;= k - 1) ans.push(nums[dq[0]]); } return ans; } console.log(maxSlidingWindow([1, 3, -1, -3, 5, 3, 6, 7], 3)); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/239-sliding-window-maximum/","summary":"\u003ch3 id=\"title\"\u003e\u003cstrong\u003eTitle\u003c/strong\u003e\u003c/h3\u003e\n\u003cp\u003eLeetCode 239: Sliding Window Maximum (Monotonic Queue ACERS Guide)\u003c/p\u003e\n\u003chr\u003e\n\u003ch3 id=\"subtitle--summary\"\u003e\u003cstrong\u003eSubtitle / Summary\u003c/strong\u003e\u003c/h3\u003e\n\u003cp\u003eSliding Window Maximum is the classic combo of sliding window + monotonic queue.\nThis article follows the ACERS template with reusable engineering patterns and multi-language implementations.\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eEstimated reading time\u003c/strong\u003e: 12–15 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003esliding window\u003c/code\u003e, \u003ccode\u003emonotonic queue\u003c/code\u003e, \u003ccode\u003earray\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Sliding Window Maximum, monotonic queue, deque, O(n)\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Monotonic-queue solution for Sliding Window Maximum with engineering practice and multi-language implementations.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget Readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003ePeople practicing LeetCode / Hot100\u003c/li\u003e\n\u003cli\u003eMid-level developers who want a reusable “sliding window + monotonic queue” template\u003c/li\u003e\n\u003cli\u003eEngineers working on real-time monitoring, log analytics, or risk control\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eRolling-window maximum appears 
everywhere: latency monitoring, price spikes, temperature alerts,\nreal-time smoothing, and many more. The brute-force approach recomputes max per window in O(nk),\nwhich is unacceptable for large n. The monotonic queue reduces it to O(n), making it the most\npractical engineering choice.\u003c/p\u003e","title":"LeetCode 239: Sliding Window Maximum (Monotonic Queue ACERS Guide)"},{"content":"What Is size_t? Why C++ Loops Prefer size_t Over int Subtitle / Abstract When you iterate containers with a for loop, size_t is often safer and closer to the intended meaning than int. This post uses the ACERS structure to explain what size_t is, why it is used, the common pitfalls, and practical patterns for production C++.\nMeta Reading time: 8-10 minutes Tags: C++, size_t, type system, loops, STL SEO keywords: size_t usage, size_t vs int, C++ loop initialization, size_t underflow Meta description: Explain size_t and why loops often use it, with safe patterns and engineering scenarios. Target readers C++ beginners who are new to size_t, sizeof, and container size() return types Mid-level engineers who have seen -Wsign-compare warnings or unsigned underflow bugs Engineers writing cross-platform or high-performance C++ Background / Motivation In C++ code, you often see loops like:\nfor (size_t i = 0; i \u0026lt; vec.size(); ++i) { ... } Common questions:\nWhy not use the more \u0026ldquo;obvious\u0026rdquo; int? What exactly is size_t, and why is it unsigned? Where are the pitfalls? 
This article answers those questions.\nA - Algorithm (Problem and Approach) The question Why use size_t for loop indices and sizes instead of int in C++?\nThis is fundamentally about type semantics and API consistency:\nsize_t is the standard type for object sizes and indices int is a signed counter with different semantics Basic example 1: container size and index #include \u0026lt;vector\u0026gt; std::vector\u0026lt;int\u0026gt; v{1, 2, 3}; for (std::size_t i = 0; i \u0026lt; v.size(); ++i) { // i matches v.size() type; no signed/unsigned warning } Basic example 2: unsigned underflow #include \u0026lt;cstddef\u0026gt; std::size_t n = 0; std::size_t x = n - 1; // not -1, but a very large positive number Concept sketch:\nsize_t (unsigned) : 0 ---------------------\u0026gt; SIZE_MAX int (signed) : -2^(N-1) ---- 0 ---- 2^(N-1)-1 Key point: size_t cannot represent negative numbers; subtraction can wrap to a huge value.\nC - Concepts (Core Ideas) What is size_t? size_t is an unsigned integer type that can represent the size of any object. sizeof returns size_t. On 64-bit systems it is typically 64-bit; on 32-bit systems it is typically 32-bit. #include \u0026lt;cstddef\u0026gt; std::size_t n = sizeof(int); What category does this belong to? Type semantics: use types to express \u0026ldquo;size/index\u0026rdquo; API consistency: matches container size() signatures Portability: guaranteed to represent any object size Key model sizeof(T) -\u0026gt; size_t Range: 0 \u0026lt;= size_t \u0026lt;= SIZE_MAX SIZE_MAX = 2^N - 1 (N is the bit width) Practical steps (with commands) Include the header: #include \u0026lt;cstddef\u0026gt; for std::size_t. Align with API: use std::size_t or container::size_type for sizes/indices. Cache bounds: store n = v.size() to avoid repeated calls and unsigned pitfalls. Avoid unsigned underflow: do not write v.size() - 1 on possibly empty containers. Reverse iteration: use for (size_t i = n; i-- \u0026gt; 0;) or std::ssize. 
Enable warnings: -Wsign-compare to surface issues early. # g++ example g++ -std=c++20 -Wall -Wextra -Wsign-compare main.cpp -o demo ./demo Runnable example: safe size_t loops #include \u0026lt;cstddef\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;utility\u0026gt; #include \u0026lt;vector\u0026gt; int main() { std::vector\u0026lt;int\u0026gt; a{5, 2, 4, 6, 1}; for (std::size_t i = 0; i + 1 \u0026lt; a.size(); ++i) { bool swapped = false; std::size_t n = a.size() - i; for (std::size_t j = 0; j + 1 \u0026lt; n; ++j) { if (a[j] \u0026gt; a[j + 1]) { std::swap(a[j], a[j + 1]); swapped = true; } } if (!swapped) break; } for (int x : a) std::cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#39; \u0026#39;; std::cout \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; // Safe reverse iteration for (std::size_t i = a.size(); i-- \u0026gt; 0; ) { std::cout \u0026lt;\u0026lt; a[i] \u0026lt;\u0026lt; \u0026#39; \u0026#39;; } std::cout \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; } Why size_t is the better fit Clearer semantics: size_t means \u0026ldquo;size/length\u0026rdquo;, int means \u0026ldquo;signed count\u0026rdquo;. Larger range: on 64-bit systems, int is usually 32-bit and may overflow on huge containers. API matching: vector::size() and string::size() return size_t. Fewer implicit conversions: mixing int and size_t triggers -Wsign-compare and can break logic. E - Engineering (Real-world Usage) Below are three real engineering scenarios with background, rationale, and runnable examples.\nScenario 1: Large-scale batch processing (C++) Background: At billion-scale data, container sizes can exceed 2^31. 
Why it fits: size_t can represent the range and aligns with STL.\n#include \u0026lt;cstddef\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;vector\u0026gt; int main() { std::vector\u0026lt;int\u0026gt; data(5, 1); std::size_t sum = 0; for (std::size_t i = 0; i \u0026lt; data.size(); ++i) { sum += static_cast\u0026lt;std::size_t\u0026gt;(data[i]); } std::cout \u0026lt;\u0026lt; sum \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; } Scenario 2: Memory allocation and buffers (C) Background: C APIs like malloc and memcpy use size_t for byte counts. Why it fits: consistent across platforms and safe for large allocations.\n#include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; int main(void) { size_t n = 5; int *p = (int*)malloc(n * sizeof(int)); if (!p) return 1; for (size_t i = 0; i \u0026lt; n; ++i) p[i] = (int)i; for (size_t i = 0; i \u0026lt; n; ++i) printf(\u0026#34;%d \u0026#34;, p[i]); printf(\u0026#34;\\n\u0026#34;); free(p); return 0; } Scenario 3: Cross-platform library APIs (C++) Background: API functions take buffer length parameters. 
Why it fits: size_t is the universal size type for callers on different platforms.\n#include \u0026lt;cstddef\u0026gt; #include \u0026lt;cstdint\u0026gt; #include \u0026lt;iostream\u0026gt; std::uint8_t checksum(const std::uint8_t* buf, std::size_t len) { std::uint8_t acc = 0; for (std::size_t i = 0; i \u0026lt; len; ++i) { acc ^= buf[i]; } return acc; } int main() { std::uint8_t payload[] = {1, 2, 3, 4}; std::cout \u0026lt;\u0026lt; static_cast\u0026lt;int\u0026gt;(checksum(payload, sizeof(payload))) \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; } R - Reflection (Deep Dive) Time and space complexity The loop examples are typically O(n) time O(1) extra space This is independent of int vs size_t; the difference is correctness and maintainability.\nAlternative approaches Option Pros Cons Use cases int index Simple Small range, signed/unsigned mismatch Small data, teaching examples size_t index Large range, API match Unsigned underflow risk Most size/index cases std::ssize Signed, safe reverse Requires C++20 When negative values are meaningful Iterators/range for Safest No index When you do not need indices Why this approach is most practical\nsize_t is the standard size type with best compatibility. Safe patterns avoid underflow pitfalls. Aligns naturally with STL APIs and avoids warnings. Common questions and pitfalls Is size_t always 64-bit? No, it depends on platform width. Is auto i = 0 OK? It deduces int, not size_t. Why is v.size() - 1 dangerous? Underflows on empty containers. Why is for (size_t i = n - 1; i \u0026gt;= 0; --i) wrong? i \u0026gt;= 0 is always true for unsigned. Does int avoid underflow? It avoids unsigned underflow but introduces range and conversion risks. Best practices Prefer std::size_t or container::size_type for sizes and indices. Cache n = v.size() to avoid repeated calls and reduce risk. For reverse loops use for (size_t i = n; i-- \u0026gt; 0;) or std::ssize. Use range-for if you do not need indices. 
Enable -Wsign-compare to surface bugs early. S - Summary Key takeaways size_t is the standard type for object size and index; sizeof returns it. It matches vector::size() and avoids signed/unsigned mismatch. Its range is larger than int on 64-bit systems. Unsigned subtraction can underflow; write conditions to avoid it. Reverse iteration has safe patterns; do not use i \u0026gt;= 0 with unsigned. References and further reading C++ reference: std::size_t: https://en.cppreference.com/w/cpp/types/size_t C++ reference: std::ssize: https://en.cppreference.com/w/cpp/iterator/ssize ISO C standard: size_t: https://en.cppreference.com/w/c/types/size_t Conclusion size_t is not a mysterious type. It is the standard way C/C++ expresses sizes and indices. If you avoid unsigned underflow and use safe loop conditions, it is more robust and more consistent than int. Consider enabling -Wsign-compare and cleaning up mixed-sign usage in your codebase.\nCall to Action (CTA) Search your codebase for places where size() is mixed with int, switch to size_t, and run tests. If you have hit a bug related to this, share the case and learnings.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/c++/size_t-why-not-int-loop/","summary":"\u003ch1 id=\"what-is-size_\"\u003e\u003cstrong\u003eWhat Is size_t? Why C++ Loops Prefer size_t Over int\u003c/strong\u003e\u003c/h1\u003e\n\u003ch3 id=\"subtitle--abstract\"\u003eSubtitle / Abstract\u003c/h3\u003e\n\u003cp\u003eWhen you iterate containers with a \u003ccode\u003efor\u003c/code\u003e loop, \u003ccode\u003esize_t\u003c/code\u003e is often safer and closer to the intended meaning than \u003ccode\u003eint\u003c/code\u003e. 
This post uses the ACERS structure to explain what \u003ccode\u003esize_t\u003c/code\u003e is, why it is used, the common pitfalls, and practical patterns for production C++.\u003c/p\u003e\n\u003ch3 id=\"meta\"\u003eMeta\u003c/h3\u003e\n\u003cul\u003e\n\u003cli\u003eReading time: 8-10 minutes\u003c/li\u003e\n\u003cli\u003eTags: C++, size_t, type system, loops, STL\u003c/li\u003e\n\u003cli\u003eSEO keywords: size_t usage, size_t vs int, C++ loop initialization, size_t underflow\u003c/li\u003e\n\u003cli\u003eMeta description: Explain size_t and why loops often use it, with safe patterns and engineering scenarios.\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch3 id=\"target-readers\"\u003eTarget readers\u003c/h3\u003e\n\u003cul\u003e\n\u003cli\u003eC++ beginners who are new to \u003ccode\u003esize_t\u003c/code\u003e, \u003ccode\u003esizeof\u003c/code\u003e, and container \u003ccode\u003esize()\u003c/code\u003e return types\u003c/li\u003e\n\u003cli\u003eMid-level engineers who have seen \u003ccode\u003e-Wsign-compare\u003c/code\u003e warnings or unsigned underflow bugs\u003c/li\u003e\n\u003cli\u003eEngineers writing cross-platform or high-performance C++\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch3 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h3\u003e\n\u003cp\u003eIn C++ code, you often see loops like:\u003c/p\u003e","title":"What Is size_t? Why C++ Loops Prefer size_t Over int"},{"content":" Subtitle / Abstract A basic counting problem: use frequency + combinations to drop O(n^2) to O(n). Includes engineering use cases and portable implementations.\nReading time: 8-10 minutes Tags: hash-table, counting, array SEO keywords: Good Pairs, hash map, frequency Meta description: Hash counting solution for Good Pairs with complexity and code. 
Target readers Beginners learning hash tables and counting Engineers who want to map interview patterns to real stats tasks Interview prep for basic counting models Background / Motivation Counting equal pairs is a classic problem. A double loop is O(n^2). With frequency counting, you can solve it in linear time and scale to large data.\nA - Algorithm (Problem and approach) Problem Given an integer array nums, a pair (i, j) is a good pair if nums[i] == nums[j] and i \u0026lt; j. Return the number of good pairs.\nInput/Output Name Type Description nums int[] integer array return int number of good pairs Examples nums output notes [1, 2, 3, 1, 1, 3] 4 (0,3) (0,4) (3,4) (2,5) [1, 1, 1, 1] 6 C(4,2) = 6 [1, 2, 3] 0 no duplicates Simple intuition:\nValue 1 appears 3 times -\u0026gt; C(3,2)=3 Value 3 appears 2 times -\u0026gt; C(2,2)=1 Total = 4 C - Concepts (Core ideas) Frequency count: count occurrences of each value Combinations: if value appears c times, pairs = c*(c-1)/2 Hash table: O(1) average update Key formula:\nFor each value v with count c: Pairs = c * (c - 1) / 2 One-pass model:\nans += count[nums[i]] count[nums[i]] += 1 Practical steps Initialize count and ans = 0 For each element x, add count[x] to ans Increment count[x] E - Engineering (Real-world usage) Scenario 1: Data quality scoring (Python) Duplicate-pair score for a column:\ndef duplicate_pair_score(values): count = {} score = 0 for v in values: score += count.get(v, 0) count[v] = count.get(v, 0) + 1 return score print(duplicate_pair_score([\u0026#34;A\u0026#34;, \u0026#34;B\u0026#34;, \u0026#34;A\u0026#34;, \u0026#34;C\u0026#34;, \u0026#34;A\u0026#34;])) Scenario 2: Batch task dedup weight (Go) package main import \u0026#34;fmt\u0026#34; func goodPairs(ids []int) int { count := map[int]int{} ans := 0 for _, id := range ids { ans += count[id] count[id]++ } return ans } func main() { fmt.Println(goodPairs([]int{7, 7, 8, 9, 7})) } Scenario 3: Frontend duplicate warning (JS) function goodPairs(items) { 
const count = new Map(); let ans = 0; for (const x of items) { ans += count.get(x) || 0; count.set(x, (count.get(x) || 0) + 1); } return ans; } console.log(goodPairs([\u0026#34;u1\u0026#34;, \u0026#34;u2\u0026#34;, \u0026#34;u1\u0026#34;, \u0026#34;u1\u0026#34;])); R - Reflection Complexity Time: O(n) Space: O(n) Alternatives Approach Time Space Notes double loop O(n^2) O(1) simple but slow sort and group O(n log n) O(1) changes order hash count (one pass) O(n) O(n) fastest in practice Pitfalls Add count[x] before incrementing to avoid self-pairing Use 64-bit integers if counts are large Pre-size hash map if possible S - Summary Good pairs equal combinations of equal values Hash counting drops O(n^2) to O(n) One-pass counting is clean and safe This model transfers to dedup stats, quality scoring, and logs Conclusion Good pairs are a deceptively simple counting problem. Once you master hash counting, many similar tasks become trivial.\nReferences https://leetcode.com/problems/number-of-good-pairs/ https://en.wikipedia.org/wiki/Combination https://docs.python.org/3/library/stdtypes.html#mapping-types-dict https://en.cppreference.com/w/cpp/container/unordered_map https://doc.rust-lang.org/std/collections/struct.HashMap.html Call to Action (CTA) Use this counting model as a base and adapt it to three-sum variants or grouped stats. 
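One way to adapt the model to grouped statistics, as suggested above, is to count duplicate pairs separately per category key (a hypothetical sketch, not from the original post; `grouped_pair_counts` and the sample rows are illustrative names):

```python
from collections import defaultdict

def grouped_pair_counts(rows):
    # rows: iterable of (category, value) pairs.
    # Returns good-pair count per category, using the same one-pass rule:
    # add the running count BEFORE incrementing it, so no self-pairs.
    count = defaultdict(int)   # (category, value) -> occurrences seen so far
    pairs = defaultdict(int)   # category -> good-pair count
    for cat, val in rows:
        pairs[cat] += count[(cat, val)]
        count[(cat, val)] += 1
    return dict(pairs)

rows = [('a', 1), ('a', 1), ('b', 2), ('a', 1), ('b', 2), ('b', 3)]
print(grouped_pair_counts(rows))  # {'a': 3, 'b': 1}
```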
Share your approach in comments.\nMulti-language reference implementations from typing import List def num_identical_pairs(nums: List[int]) -\u0026gt; int: count = {} ans = 0 for x in nums: ans += count.get(x, 0) count[x] = count.get(x, 0) + 1 return ans if __name__ == \u0026#34;__main__\u0026#34;: print(num_identical_pairs([1, 2, 3, 1, 1, 3])) #include \u0026lt;stdint.h\u0026gt; #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; typedef struct { int key; int count; int used; } Entry; static unsigned hash_int(int key) { return (uint32_t)key * 2654435761u; } static int find_slot(Entry *table, int cap, int key, int *found) { unsigned mask = (unsigned)cap - 1u; unsigned idx = hash_int(key) \u0026amp; mask; while (table[idx].used \u0026amp;\u0026amp; table[idx].key != key) { idx = (idx + 1u) \u0026amp; mask; } *found = table[idx].used \u0026amp;\u0026amp; table[idx].key == key; return (int)idx; } long long num_identical_pairs(const int *nums, int n) { int cap = 1; while (cap \u0026lt; n * 2) cap \u0026lt;\u0026lt;= 1; if (cap \u0026lt; 2) cap = 2; Entry *table = (Entry *)calloc((size_t)cap, sizeof(Entry)); if (!table) return 0; long long ans = 0; for (int i = 0; i \u0026lt; n; ++i) { int found = 0; int pos = find_slot(table, cap, nums[i], \u0026amp;found); if (found) { ans += table[pos].count; table[pos].count += 1; } else { table[pos].used = 1; table[pos].key = nums[i]; table[pos].count = 1; } } free(table); return ans; } int main(void) { int nums[] = {1, 2, 3, 1, 1, 3}; printf(\u0026#34;%lld\\n\u0026#34;, num_identical_pairs(nums, 6)); return 0; } #include \u0026lt;iostream\u0026gt; #include \u0026lt;unordered_map\u0026gt; #include \u0026lt;vector\u0026gt; long long num_identical_pairs(const std::vector\u0026lt;int\u0026gt; \u0026amp;nums) { std::unordered_map\u0026lt;int, long long\u0026gt; count; long long ans = 0; for (int x : nums) { ans += count[x]; count[x] += 1; } return ans; } int main() { std::vector\u0026lt;int\u0026gt; nums{1, 2, 3, 1, 
1, 3}; std::cout \u0026lt;\u0026lt; num_identical_pairs(nums) \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 0; } package main import \u0026#34;fmt\u0026#34; func numIdenticalPairs(nums []int) int64 { count := map[int]int64{} var ans int64 = 0 for _, x := range nums { ans += count[x] count[x]++ } return ans } func main() { fmt.Println(numIdenticalPairs([]int{1, 2, 3, 1, 1, 3})) } use std::collections::HashMap; fn num_identical_pairs(nums: \u0026amp;[i32]) -\u0026gt; i64 { let mut count: HashMap\u0026lt;i32, i64\u0026gt; = HashMap::new(); let mut ans: i64 = 0; for \u0026amp;x in nums { let c = *count.get(\u0026amp;x).unwrap_or(\u0026amp;0); ans += c; count.insert(x, c + 1); } ans } fn main() { let nums = vec![1, 2, 3, 1, 1, 3]; println!(\u0026#34;{}\u0026#34;, num_identical_pairs(\u0026amp;nums)); } function numIdenticalPairs(nums) { const count = new Map(); let ans = 0; for (const x of nums) { ans += count.get(x) || 0; count.set(x, (count.get(x) || 0) + 1); } return ans; } console.log(numIdenticalPairs([1, 2, 3, 1, 1, 3])); ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/1512-number-of-good-pairs/","summary":"\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\nA basic counting problem: use frequency + combinations to drop O(n^2) to O(n). 
Includes engineering use cases and portable implementations.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 8-10 minutes\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTags\u003c/strong\u003e: \u003ccode\u003ehash-table\u003c/code\u003e, \u003ccode\u003ecounting\u003c/code\u003e, \u003ccode\u003earray\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSEO keywords\u003c/strong\u003e: Good Pairs, hash map, frequency\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMeta description\u003c/strong\u003e: Hash counting solution for Good Pairs with complexity and code.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eBeginners learning hash tables and counting\u003c/li\u003e\n\u003cli\u003eEngineers who want to map interview patterns to real stats tasks\u003c/li\u003e\n\u003cli\u003eInterview prep for basic counting models\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eCounting equal pairs is a classic problem. A double loop is O(n^2). With frequency counting, you can solve it in linear time and scale to large data.\u003c/p\u003e","title":"LeetCode 1512: Number of Good Pairs (Hash Counting ACERS Guide)"},{"content":"XOR and RC4: From Principles to Go Practice (with Safer Alternatives) Subtitle / Abstract Use minimal math to explain XOR and RC4, provide runnable Go examples, and clarify why RC4 is considered insecure with recommended alternatives.\nTarget readers Backend engineers reading legacy RC4 code Beginners who confuse encoding and encryption Intermediate developers building a stream-cipher mental model Background / Motivation Many systems still contain RC4 or custom decryption logic. Common mistakes include treating Base64 as encryption and ignoring integrity checks. 
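The "Base64 is encoding, not encryption" mistake is easy to demonstrate: decoding requires no key at all, so anyone holding the string recovers the data (a minimal sketch, Python used for brevity since the point is language-independent):

```python
import base64

secret = 'this was never protected'
encoded = base64.b64encode(secret.encode()).decode()

# No key involved: any recipient can invert the encoding.
decoded = base64.b64decode(encoded).decode()
assert decoded == secret
print(encoded)
```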
Understanding XOR and RC4 helps you evaluate security correctly and avoid copying outdated designs into new systems.\nCore concepts XOR: bitwise operation, reversible Stream cipher: XOR a pseudorandom keystream with plaintext bytes RC4: classic stream cipher, no longer recommended Base64: encoding, not encryption Integrity: encryption alone does not prevent tampering Practical steps Receive a Base64 string (often RC4 output) Decode Base64 to raw bytes Initialize RC4 with a shared key XOR keystream with bytes Convert output to UTF-8 text if it is textual Runnable example (Go) package main import ( \u0026#34;crypto/rc4\u0026#34; \u0026#34;encoding/base64\u0026#34; \u0026#34;fmt\u0026#34; ) func rc4XOR(key string, data []byte) ([]byte, error) { c, err := rc4.NewCipher([]byte(key)) if err != nil { return nil, err } out := make([]byte, len(data)) c.XORKeyStream(out, data) return out, nil } func encryptToBase64RC4(key, plaintext string) (string, error) { out, err := rc4XOR(key, []byte(plaintext)) if err != nil { return \u0026#34;\u0026#34;, err } return base64.StdEncoding.EncodeToString(out), nil } func decryptBase64RC4(key, encoded string) (string, error) { raw, err := base64.StdEncoding.DecodeString(encoded) if err != nil { return \u0026#34;\u0026#34;, err } out, err := rc4XOR(key, raw) if err != nil { return \u0026#34;\u0026#34;, err } return string(out), nil } func main() { key := \u0026#34;demo-key\u0026#34; plaintext := \u0026#34;hello rc4\u0026#34; enc, _ := encryptToBase64RC4(key, plaintext) dec, _ := decryptBase64RC4(key, enc) fmt.Println(enc) fmt.Println(dec) } Run:\ngo run rc4_demo.go Explanation XOR is reversible because a XOR b XOR b = a. RC4 generates a pseudorandom keystream and XORs it with data byte by byte. 
Since encryption and decryption use the same keystream, keystream reuse or bias can leak information.\nCommon pitfalls Base64 is encoding, not encryption RC4 has known biases and is deprecated Encryption alone does not provide integrity; use MAC or AEAD Reusing keys can reveal plaintext Best practices Use AES-GCM or ChaCha20-Poly1305 for new systems Migrate legacy RC4 systems as soon as possible Consider confidentiality and integrity together Conclusion XOR is the core operation behind stream ciphers. RC4 is easy to understand but unsafe; it is suitable for reading legacy code, not new design. Modern systems should use AEAD algorithms instead.\nReferences https://www.rfc-editor.org/rfc/rfc6229 https://www.rfc-editor.org/rfc/rfc7465 https://en.wikipedia.org/wiki/RC4 https://pkg.go.dev/crypto/rc4 Meta Reading time: 8 minutes Tags: go, security, crypto, rc4, xor SEO keywords: XOR, RC4, stream cipher, Go, encryption, Base64 Meta description: Explain XOR and RC4 with runnable Go examples and why RC4 is no longer secure. 
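The XOR identity that makes stream ciphers reversible, a ^ b ^ b == a, and the danger of keystream reuse can both be verified in a few lines (a sketch added for illustration, not from the original post; Python is used here because the properties are language-independent, and `secrets.token_bytes` merely stands in for an RC4 keystream):

```python
import secrets

def xor_bytes(data: bytes, keystream: bytes) -> bytes:
    # XOR each data byte with the corresponding keystream byte.
    return bytes(d ^ k for d, k in zip(data, keystream))

plaintext = b'hello rc4'
keystream = secrets.token_bytes(len(plaintext))  # stand-in for RC4 output
ciphertext = xor_bytes(plaintext, keystream)
recovered = xor_bytes(ciphertext, keystream)     # same operation decrypts
assert recovered == plaintext

# Why keystream reuse is dangerous: XOR of two ciphertexts under the same
# keystream equals XOR of the two plaintexts; the key cancels out entirely.
p2 = b'attack at'                 # same length as plaintext
c2 = xor_bytes(p2, keystream)
assert xor_bytes(ciphertext, c2) == xor_bytes(plaintext, p2)
print('xor identities hold')
```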
Call to Action (CTA) After running the demo, replace RC4 with AES-GCM and document the differences and migration cost for your team.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/go/xor-rec4-primer/","summary":"\u003ch1 id=\"xor-and-rc4-from-principles-to-go-practice-with-safer-alternatives\"\u003eXOR and RC4: From Principles to Go Practice (with Safer Alternatives)\u003c/h1\u003e\n\u003ch2 id=\"subtitle--abstract\"\u003eSubtitle / Abstract\u003c/h2\u003e\n\u003cp\u003eUse minimal math to explain XOR and RC4, provide runnable Go examples, and clarify why RC4 is considered insecure with recommended alternatives.\u003c/p\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eBackend engineers reading legacy RC4 code\u003c/li\u003e\n\u003cli\u003eBeginners who confuse encoding and encryption\u003c/li\u003e\n\u003cli\u003eIntermediate developers building a stream-cipher mental model\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eMany systems still contain RC4 or custom decryption logic. Common mistakes include treating Base64 as encryption and ignoring integrity checks. Understanding XOR and RC4 helps you evaluate security correctly and avoid copying outdated designs into new systems.\u003c/p\u003e","title":"XOR and RC4: From Principles to Go Practice (with Safer Alternatives)"},{"content":" Final post of the series: put all algorithms into one decision framework so you can quickly choose what to use, why it fits, and how to validate in real projects.\nTarget Readers Engineers who need to justify sort choices in projects/interviews/presentations. People who want a practical cheat sheet using size, distribution, stability, and memory. Background and Motivation Too many algorithms lead to \u0026ldquo;default quicksort\u0026rdquo; or \u0026ldquo;blind stability\u0026rdquo; without a framework. 
This post offers an actionable selection table + test checklist covering built-ins, non-comparison, external sort, and hybrids. A - Algorithm (Theme and Quick Reference) Core question: Choose a sorting strategy under different constraints.\nQuick reference (priority suggestions)\nStable + nearly sorted: TimSort / merge (Python/Java default). In-place + worst-case bound: Introsort (C++ std::sort idea) / heap sort. Known range/digits: counting/bucket/radix. Small/near-sorted: insertion; also as a hybrid subroutine. External sort (beyond RAM): chunk + multi-way merge (stable). Teaching/demo: bubble/selection/insertion to show stability and swap cost. C - Concepts (Core Dimensions) Dimension Focus Algorithms Time complexity average/worst quick/Introsort/heap/merge/TimSort/non-comparison Space in-place vs O(n) quick/heap/Introsort in-place; merge/TimSort/counting/radix need extra space Stability preserve relative order merge/TimSort/insertion/counting/radix; quick/heap/selection/shell are unstable Data characteristics size/order/range near-sorted -\u0026gt; TimSort/insertion; known range -\u0026gt; counting/radix; large random -\u0026gt; Introsort/quick Environment memory/external storage memory tight -\u0026gt; in-place; beyond RAM -\u0026gt; external merge E - Engineering Scenarios Scenario 1: API pagination sort (Go) Need: mid-size, no stability, memory tight. Choice: sort.Slice (Introsort idea), insertion for small segments. Validate: reverse and heavy-duplicate cases, check for degeneration and time. Scenario 2: Log batch processing (Python) Need: stable, near-sorted (by time bucket). Choice: built-in TimSort. Validate: local inversions; verify stability preserves order. Scenario 3: Large file sorting (C++) Need: data exceeds RAM, stable. Choice: external sort (chunk sort + k-way merge). Validate: chunk size vs I/O; min-heap merge; ensure stable merge. Scenario 4: Known-range integer batches (Go) Need: small range, speed. 
Choice: counting or radix; if range large but digits limited, use radix. Validate: estimate k vs n; stress test extremes. Scenario 5: Frontend stable table sort (JavaScript) Need: stable multi-key order. Choice: browser built-in (usually stable) or custom stable merge/TimSort; if unsure, map index to preserve stability. R - Reflection Time/space trade-off: in-place but unstable (quick/heap/Introsort) vs stable with extra memory (merge/TimSort/counting/radix). Worst-case guarantees: Introsort/heap/merge have bounds; quicksort needs anti-degeneration strategy; TimSort worst-case is still O(n log n). Data characteristics: known range/digits make non-comparison a big win; near-sorted favors TimSort/insertion. External sorting: I/O dominated; focus on chunk size, merge fan-in, and temp files. S - Summary Ask four questions first: size/distribution? stability? memory/external? range/digits? Built-in sorts are often enough: Python/Java stable TimSort; C++/Go Introsort-like unstable; customize only when needed. Non-comparison sorts are powerful under bounded range/digits; external sort handles beyond-RAM data. Hybrid strategies are the norm: insertion for small segments, heap fallback on depth, run detection + merge. Practice Guide / Steps Write a selection table: scenario -\u0026gt; requirements -\u0026gt; choice -\u0026gt; rationale. Benchmark on six datasets: random, reverse, nearly sorted, heavy duplicates, range-bounded, beyond-RAM. Add monitoring: sort time, comparisons (if measurable), memory; for external sorts track I/O. Require a \u0026ldquo;sort algorithm + rationale\u0026rdquo; field in PRs or design docs. Common Pitfalls and Notes Ignoring stability when business depends on relative order; use stable sort or index mapping. Underestimating memory: counting/radix can explode; external sort needs temp storage planning. Pivot degeneration: custom quicksort needs random/median-of-three + insertion threshold + tail recursion. 
Using quicksort on nearly sorted data: TimSort/insertion may be faster. Runnable Example: Simple Selection Function (Python) def choose_sort(stable: bool, n: int, range_known=False, near_sorted=False): if range_known: return \u0026#34;counting/radix\u0026#34; if stable: if n \u0026gt; 5e5: return \u0026#34;merge/timsort\u0026#34; return \u0026#34;timsort\u0026#34; if near_sorted and n \u0026lt; 1e4: return \u0026#34;insertion\u0026#34; if n \u0026gt; 1e6: return \u0026#34;introsort/heap\u0026#34; return \u0026#34;introsort/quicksort\u0026#34; print(choose_sort(stable=True, n=10000, range_known=False, near_sorted=True)) References and Further Reading The previous 7 posts in this series: O(n^2) baselines, shell, merge, quick, heap, non-comparison, TimSort/Introsort. CLRS sorting chapters; Bentley \u0026amp; McIlroy \u0026ldquo;Engineering a Sort Function\u0026rdquo;. Meta Reading time: approx. 12 min SEO keywords: sorting selection, stable sort, external sort, TimSort, Introsort Meta description: sorting series finale with decision tables by scale/distribution/stability/memory plus testing guidance for real projects. Call to Action (CTA) Fill out a \u0026ldquo;sorting selection table\u0026rdquo; for your project with scenario/requirements/algorithm/rationale. Run benchmarks on six data distributions and record time/memory to validate your choice. If you need external sorting, build a chunk + merge PoC and measure I/O and storage costs. ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/9.sorting-series-selection-guide/","summary":"Practical selection guide: decision tables by scale/distribution/stability/memory, engineering scenarios, test checklist, and common pitfalls to apply the series.","title":"Sorting Series (Final): Practical Selection - Choose by Scale, Stability, Memory, Distribution"},{"content":" Core idea: even with AI, you should be able to implement critical paths offline. 
AI accelerates; it does not replace thinking. This post combines learning science and practical workflows with a self-checklist.\nTarget readers Mid to senior engineers and tech leads who want AI speed without losing control. Team leads adopting AI-assisted coding or documentation. Engineers who already work with Git, tests, and code reviews. Background and motivation Pain points: Copy-pasting model output without understanding leads to fragile code and hard debugging. Over-reliance on prompts reduces independent implementation ability. Architecture and security decisions get driven by the model instead of the engineer. Goals: Implement critical paths from scratch without AI when needed. Use AI for validation and refactoring, not for blind generation. Build a \u0026ldquo;think first, verify later\u0026rdquo; workflow. Core concepts Feynman technique: if you can explain it simply, you understand it. Deliberate practice: target weak points with feedback and challenge. Retrieval practice: recall and derive before checking answers. Red/blue mode with AI: human writes first (blue), AI critiques (red). Replaceability: can you replace the model and still ship the feature? Practical steps Write a human plan first, then ask AI Sketch interfaces, flow, and edge cases before prompting. Limit copy/paste; hand-type key logic Routes, migrations, permissions should be typed by you; AI can review. Side-by-side comparison Left: your solution, right: AI suggestions. Keep only what you can explain. Retrieval practice loop Implement without AI, then compare with AI, mark blind spots, rewrite once. Feynman output Summarize in 3-5 sentences; if you cannot, study again. 
Runnable micro-exercise Implement a unique function that preserves order:\ndef unique_keep_order(items): seen = set() result = [] for x in items: if x in seen: continue seen.add(x) result.append(x) return result assert unique_keep_order([1, 2, 2, 3]) == [1, 2, 3] Exercise flow:\nRound 1: no AI, implement and test; note gaps. Round 2: compare with AI, check edge cases (e.g., unhashable items). Round 3: explain complexity and limits to a teammate or in a recording. Explanation Why limit copy/paste? It skips the recall-derive-verify loop and makes understanding shallow. Hand-typing exposes gaps in API knowledge and naming. Trade-offs Fully manual: safest but slow; use for security-critical modules. AI review: faster but needs human design and merge. AI scaffolding: good for kickstart, but requires tests and refactoring. Common questions How to avoid prompt dependence? Write pseudocode and tests first, then ask AI. What if time is tight? Ask AI for checklists or tests; you implement the core. How to prove you are not being driven? Document your decisions and reasons. Security/compliance: never paste secrets; use local or private models if needed. Best practices Weekly: rewrite a core path without AI (auth, billing, migrations). Add PR template fields: what decisions were made by humans vs AI. Use TDD: write tests first, ask AI for edge-case tests only. Keep the \u0026ldquo;explainability\u0026rdquo; rule: if you cannot explain it in 3 sentences, rework. Track blind spots and practice deliberately. Conclusion AI is a multiplier, not a driver. Keep replaceability and explanation as your safety belt. Use Feynman + deliberate practice + retrieval practice to lock in understanding. 
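A natural Round 2 follow-up to the micro-exercise above: the set-based version raises TypeError on unhashable items such as lists. A sketch of an order-preserving variant with a linear-scan fallback (the function name is illustrative):

```python
def unique_keep_order_safe(items):
    """Order-preserving de-duplication that tolerates unhashable items.

    Hashable items use a set for O(1) membership checks; unhashable
    items fall back to a linear scan, so each one costs O(n).
    """
    seen = set()
    seen_unhashable = []
    result = []
    for x in items:
        try:
            if x in seen:
                continue
            seen.add(x)
        except TypeError:  # e.g. x is a list or dict
            if x in seen_unhashable:
                continue
            seen_unhashable.append(x)
        result.append(x)
    return result

assert unique_keep_order_safe([1, 2, 2, 3]) == [1, 2, 3]
assert unique_keep_order_safe([[1], [1], [2]]) == [[1], [2]]
```

Explaining why the fallback degrades overall complexity toward O(n^2) is itself a good Feynman check.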
References Richard Feynman, \u0026ldquo;The Feynman Technique\u0026rdquo; Anders Ericsson, \u0026ldquo;Peak\u0026rdquo; Roediger \u0026amp; Karpicke, \u0026ldquo;Test-Enhanced Learning\u0026rdquo; Thoughtworks Technology Radar (AI-assisted coding) Meta Reading time: about 9 minutes Tags: AI assistant, engineering practice, learning methods SEO keywords: AI dependence, engineering autonomy, deliberate practice, Feynman learning, AI code review Updated: 2025-11-14 Call to Action (CTA) Pick one critical module, hand-write it, then use AI to review and record the diff. Add an \u0026ldquo;AI assistance scope\u0026rdquo; field to your PR template. Share your \u0026ldquo;no-AI rewrite\u0026rdquo; experiences and learnings. ","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/ai-usage-self-control/","summary":"How to avoid copy-paste dependence when using AI for coding: Feynman technique, deliberate practice, retrieval practice, and a practical self-check workflow.","title":"Do Not Let AI Drive You: Keep the Ability to Build Independently"},{"content":" A practical emmet-vim handbook for developers who live in Vim/Neovim but feel HTML/CSS is slow: fast install, must-know shortcuts, minimal runnable examples, and a validation/troubleshooting checklist.\nReader profile and prerequisites Frontend or full-stack engineers using Vim/Neovim for UI work. Comfortable with basic HTML/CSS and editing ~/.vimrc or init.lua. Suggested environment: Vim 8.2+ with +python3 or Neovim 0.7+; Git installed; Homebrew/Apt available. Background and problem Scenario: typing \u0026lt;div class=\u0026quot;card\u0026quot;\u0026gt;\u0026lt;img ...\u0026gt; by hand is slow and error-prone. Pain points: Repetitive HTML/CSS blocks break flow. Managing tag closures and nesting is easy to mess up. VS Code has Emmet built in; Vim lacks comparable speed without a plugin. 
Goal: expand a full structure in a few keystrokes; example input ul.list\u0026gt;li.item$*3\u0026gt;a{click} should expand correctly, with reliable shortcuts and configurable behavior. Core concepts Abbreviation: ul\u0026gt;li*3 expands to a full tag tree with one shortcut. Trigger key: emmet-vim default is \u0026lt;C-y\u0026gt;, (Ctrl+y then comma); \u0026lt;C-y\u0026gt;d balances/wraps tags. Context aware: in CSS, m10-20 expands to margin: 10px 20px;; in HTML, it builds tags. Numbering with $: li.item$*3 creates item1/2/3; ${} supports placeholders. Environment and dependencies Vim 8.2+ with :echo has('python3') returning 1, or Neovim 0.7+. Python 3.8+ (python3 --version) used by the Emmet engine. Any plugin manager: vim-plug, dein, lazy.nvim, packer.nvim. Optional: Node 18+ for other Emmet CLI tools (not required for emmet-vim). Typical install (vim-plug): \u0026#34; ~/.vimrc or init.vim call plug#begin(\u0026#39;~/.vim/plugged\u0026#39;) Plug \u0026#39;mattn/emmet-vim\u0026#39; call plug#end() let g:user_emmet_leader_key=\u0026#39;,\u0026#39; \u0026#34; customize leader; default is \u0026lt;C-y\u0026gt; Run :PlugInstall in Vim after setup.\nPractical steps (copy-ready) 1) Verify Python support :echo has(\u0026#39;python3\u0026#39;) Expected output is 1. 
If not, install a Vim build with Python3 or configure Neovim provider.\n2) Configure basic key bindings \u0026#34; Make Emmet trigger shorter: use comma as leader let g:user_emmet_leader_key=\u0026#39;,\u0026#39; \u0026#34; Enable in HTML/CSS/JSX let g:user_emmet_settings = { \\ \u0026#39;javascript.jsx\u0026#39; : { \\ \u0026#39;extends\u0026#39; : \u0026#39;html\u0026#39; \\ } \\} Expected: in HTML/JSX, type an abbreviation and press ,+, or ,+; (same as \u0026lt;C-y\u0026gt;,).\n3) HTML list example Input:\nul.list\u0026gt;li.item$*3\u0026gt;a{click me} Press ,+, to expand:\n\u0026lt;ul class=\u0026#34;list\u0026#34;\u0026gt; \u0026lt;li class=\u0026#34;item1\u0026#34;\u0026gt;\u0026lt;a href=\u0026#34;\u0026#34;\u0026gt;click me\u0026lt;/a\u0026gt;\u0026lt;/li\u0026gt; \u0026lt;li class=\u0026#34;item2\u0026#34;\u0026gt;\u0026lt;a href=\u0026#34;\u0026#34;\u0026gt;click me\u0026lt;/a\u0026gt;\u0026lt;/li\u0026gt; \u0026lt;li class=\u0026#34;item3\u0026#34;\u0026gt;\u0026lt;a href=\u0026#34;\u0026#34;\u0026gt;click me\u0026lt;/a\u0026gt;\u0026lt;/li\u0026gt; \u0026lt;/ul\u0026gt; 4) Wrap or rebalance tags Select text, type ul\u0026gt;li*, press ,+w (Wrap with abbreviation) to wrap it in a list. On a tag, press ,+d to balance select the parent and quickly rearrange. 5) CSS abbreviations Input: p10-20 bgc#0f172a c#e2e8f0 then trigger:\npadding: 10px 20px; background-color: #0f172a; color: #e2e8f0; 6) JSX/TSX usage Extend javascriptreact / typescriptreact in g:user_emmet_settings. 
In JSX, input Button.primary\u0026gt;{Submit} then trigger: \u0026lt;Button className=\u0026#34;primary\u0026#34;\u0026gt;Submit\u0026lt;/Button\u0026gt; Make sure filetype is javascriptreact/typescriptreact.\nMore frequent snippets (ready to paste) 1) Semantic page shell + top nav Input:\nheader.site\u0026gt;div.container\u0026gt;h1.logo{Brand}+nav\u0026gt;ul\u0026gt;li*3\u0026gt;a{Nav $}+button.btn.primary{Sign up} Output:\n\u0026lt;header class=\u0026#34;site\u0026#34;\u0026gt; \u0026lt;div class=\u0026#34;container\u0026#34;\u0026gt; \u0026lt;h1 class=\u0026#34;logo\u0026#34;\u0026gt;Brand\u0026lt;/h1\u0026gt; \u0026lt;nav\u0026gt; \u0026lt;ul\u0026gt; \u0026lt;li\u0026gt;\u0026lt;a href=\u0026#34;\u0026#34;\u0026gt;Nav 1\u0026lt;/a\u0026gt;\u0026lt;/li\u0026gt; \u0026lt;li\u0026gt;\u0026lt;a href=\u0026#34;\u0026#34;\u0026gt;Nav 2\u0026lt;/a\u0026gt;\u0026lt;/li\u0026gt; \u0026lt;li\u0026gt;\u0026lt;a href=\u0026#34;\u0026#34;\u0026gt;Nav 3\u0026lt;/a\u0026gt;\u0026lt;/li\u0026gt; \u0026lt;/ul\u0026gt; \u0026lt;/nav\u0026gt; \u0026lt;button class=\u0026#34;btn primary\u0026#34;\u0026gt;Sign up\u0026lt;/button\u0026gt; \u0026lt;/div\u0026gt; \u0026lt;/header\u0026gt; 2) Form with labels and submit Input:\nform#contact\u0026gt;label[for=name]{Name}+input#name[type=text required placeholder=Your name]+label[for=email]{Email}+input#email[type=email required placeholder=hi@example.com]+button.btn[type=submit]{Send} Output:\n\u0026lt;form id=\u0026#34;contact\u0026#34;\u0026gt; \u0026lt;label for=\u0026#34;name\u0026#34;\u0026gt;Name\u0026lt;/label\u0026gt; \u0026lt;input id=\u0026#34;name\u0026#34; type=\u0026#34;text\u0026#34; required placeholder=\u0026#34;Your name\u0026#34;\u0026gt; \u0026lt;label for=\u0026#34;email\u0026#34;\u0026gt;Email\u0026lt;/label\u0026gt; \u0026lt;input id=\u0026#34;email\u0026#34; type=\u0026#34;email\u0026#34; required placeholder=\u0026#34;hi@example.com\u0026#34;\u0026gt; \u0026lt;button class=\u0026#34;btn\u0026#34; 
type=\u0026#34;submit\u0026#34;\u0026gt;Send\u0026lt;/button\u0026gt; \u0026lt;/form\u0026gt; 3) Card grid (blog/product list) Input:\nsection.blog\u0026gt;h2{Latest Posts}+div.grid\u0026gt;article.card$*3\u0026gt;img[alt=thumb$ src=/img/thumb$.jpg]+h3{Post $}+p{Short teaser}+a.read[href=/post$]{Read more} Output (excerpt):\n\u0026lt;section class=\u0026#34;blog\u0026#34;\u0026gt; \u0026lt;h2\u0026gt;Latest Posts\u0026lt;/h2\u0026gt; \u0026lt;div class=\u0026#34;grid\u0026#34;\u0026gt; \u0026lt;article class=\u0026#34;card1\u0026#34;\u0026gt; \u0026lt;img alt=\u0026#34;thumb1\u0026#34; src=\u0026#34;/img/thumb1.jpg\u0026#34;\u0026gt; \u0026lt;h3\u0026gt;Post 1\u0026lt;/h3\u0026gt; \u0026lt;p\u0026gt;Short teaser\u0026lt;/p\u0026gt; \u0026lt;a class=\u0026#34;read\u0026#34; href=\u0026#34;/post1\u0026#34;\u0026gt;Read more\u0026lt;/a\u0026gt; \u0026lt;/article\u0026gt; ... \u0026lt;/div\u0026gt; \u0026lt;/section\u0026gt; 4) Table with auto numbering Input:\ntable.table\u0026gt;thead\u0026gt;tr\u0026gt;th*3{Col $}+tbody\u0026gt;tr*3\u0026gt;td{Row $ Col 1}+td{Row $ Col 2}+td{Row $ Col 3} Output:\n\u0026lt;table class=\u0026#34;table\u0026#34;\u0026gt; \u0026lt;thead\u0026gt; \u0026lt;tr\u0026gt; \u0026lt;th\u0026gt;Col 1\u0026lt;/th\u0026gt; \u0026lt;th\u0026gt;Col 2\u0026lt;/th\u0026gt; \u0026lt;th\u0026gt;Col 3\u0026lt;/th\u0026gt; \u0026lt;/tr\u0026gt; \u0026lt;/thead\u0026gt; \u0026lt;tbody\u0026gt; \u0026lt;tr\u0026gt; \u0026lt;td\u0026gt;Row 1 Col 1\u0026lt;/td\u0026gt; \u0026lt;td\u0026gt;Row 1 Col 2\u0026lt;/td\u0026gt; \u0026lt;td\u0026gt;Row 1 Col 3\u0026lt;/td\u0026gt; \u0026lt;/tr\u0026gt; \u0026lt;tr\u0026gt; \u0026lt;td\u0026gt;Row 2 Col 1\u0026lt;/td\u0026gt; \u0026lt;td\u0026gt;Row 2 Col 2\u0026lt;/td\u0026gt; \u0026lt;td\u0026gt;Row 2 Col 3\u0026lt;/td\u0026gt; \u0026lt;/tr\u0026gt; \u0026lt;tr\u0026gt; \u0026lt;td\u0026gt;Row 3 Col 1\u0026lt;/td\u0026gt; \u0026lt;td\u0026gt;Row 3 Col 2\u0026lt;/td\u0026gt; \u0026lt;td\u0026gt;Row 3 Col 
3\u0026lt;/td\u0026gt; \u0026lt;/tr\u0026gt; \u0026lt;/tbody\u0026gt; \u0026lt;/table\u0026gt; 5) JSX/TSX component snippet Input:\nCard\u0026gt;Image[src=/hero.png alt=Hero aria-label=Hero]+h3{Landing}+p{Faster HTML}+Button.primary{Get started} Expanded in React/TSX:\n\u0026lt;Card\u0026gt; \u0026lt;Image src=\u0026#34;/hero.png\u0026#34; alt=\u0026#34;Hero\u0026#34; aria-label=\u0026#34;Hero\u0026#34; /\u0026gt; \u0026lt;h3\u0026gt;Landing\u0026lt;/h3\u0026gt; \u0026lt;p\u0026gt;Faster HTML\u0026lt;/p\u0026gt; \u0026lt;Button className=\u0026#34;primary\u0026#34;\u0026gt;Get started\u0026lt;/Button\u0026gt; \u0026lt;/Card\u0026gt; 6) CSS quick combo (Emmet CSS syntax) Input:\nd:f ai:c jc:sb g:16 p:16 m:0 bdrs:12px bgc:#0f172a c:#e2e8f0 Output:\ndisplay: flex; align-items: center; justify-content: space-between; gap: 16px; padding: 16px; margin: 0; border-radius: 12px; background-color: #0f172a; color: #e2e8f0; 7) Wrap with abbreviation examples Select lines \u0026ldquo;Item A\u0026rdquo; and \u0026ldquo;Item B\u0026rdquo;, type ul.list\u0026gt;li*, press ,+w: \u0026lt;ul class=\u0026#34;list\u0026#34;\u0026gt; \u0026lt;li\u0026gt;Item A\u0026lt;/li\u0026gt; \u0026lt;li\u0026gt;Item B\u0026lt;/li\u0026gt; \u0026lt;/ul\u0026gt; Great for converting raw text to a list or card container. Minimal runnable demo (local) Create demo.html: \u0026lt;!doctype html\u0026gt; \u0026lt;html\u0026gt; \u0026lt;head\u0026gt;\u0026lt;meta charset=\u0026#34;UTF-8\u0026#34;\u0026gt;\u0026lt;title\u0026gt;Emmet Demo\u0026lt;/title\u0026gt;\u0026lt;/head\u0026gt; \u0026lt;body\u0026gt; \u0026lt;!-- type emmet abbreviations here and trigger expand --\u0026gt; \u0026lt;/body\u0026gt; \u0026lt;/html\u0026gt; Open in Vim, enter section.hero\u0026gt;h1{Hello}+p{Speed up with emmet-vim}+ul.features\u0026gt;li.feature$*3 inside \u0026lt;body\u0026gt;. Trigger expansion, save with :w, open in browser to see the result. 
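If the same abbreviation keeps reappearing (like the hero section in the demo above), emmet-vim can register it as a named snippet via g:user_emmet_settings. A minimal sketch; the 'snippets' key follows the upstream Emmet settings format, and exact keys can vary by version, so double-check the plugin README before relying on it:

```vim
" ~/.vimrc: typing `hero` plus the trigger key expands the whole block.
" Merge this with any existing g:user_emmet_settings (e.g. the JSX
" 'extends' entry shown earlier) instead of redefining the variable.
let g:user_emmet_settings = {
\ 'html': {
\   'snippets': {
\     'hero': "section.hero>h1{Hello}+p{Speed up with emmet-vim}+ul.features>li.feature$*3",
\   },
\ },
\}
```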
Trade-offs and choices emmet-vim in Vim vs Emmet via LSP/completion: the former is dependency-free and instant; the latter may need Node or a server but integrates with completions. Trigger key: default \u0026lt;C-y\u0026gt; avoids conflicts but is two keystrokes; , or \u0026lt;C-e\u0026gt; is faster but can conflict with other plugins. Formatting: emmet-vim does not format; run Prettier/ESLint on output if needed. Common pitfalls and FAQ Not working: has('python3') is 0, wrong filetype, or :PlugInstall not run. JSX expands with HTML attrs: ensure javascript.jsx/javascriptreact extends html; set filetype if needed. Key conflicts: check mappings with :verbose imap , , and rebind. Multi-cursor: emmet-vim does not support it natively; use vim-visual-multi and trigger after inserting abbreviations. Performance: large expansions can be slow; use on component snippets instead of massive trees. Test and validation checklist :echo has('python3') == 1. In HTML buffer, type div#app\u0026gt;header\u0026gt;h1{Hi}+nav\u0026gt;ul\u0026gt;li*3\u0026gt;a{link$} and expand correctly. In CSS buffer, m10-20 and bgc#333 expand to valid declarations. In JSX buffer, Card\u0026gt;Button.primary{Go} expands to \u0026lt;Card\u0026gt;\u0026lt;Button className=\u0026quot;primary\u0026quot;\u0026gt;Go\u0026lt;/Button\u0026gt;\u0026lt;/Card\u0026gt;. No errors in :messages; trigger key not overridden. Performance and accessibility Prefer semantic tags (header/nav/main/section) for screen readers and SEO. Always add alt for images: img[alt=avatar src=/avatar.png]. Add aria-label placeholders for icon-only buttons. Emmet does not impact performance metrics directly, but avoid unnecessary nesting. Best practices Configure g:user_emmet_settings explicitly per filetype for HTML/JSX/TSX consistency. Customize the leader (e.g., ,) and sync it across machines via dotfiles. Combine with formatters (Prettier/StyLua/ESLint) to normalize style on save. 
Write a skeleton abbreviation first, then add classes and attributes. Remember $ for numbering and {} for text content. Summary and next steps You now have: install steps, key bindings, HTML/CSS/JSX examples, validation checklist, and troubleshooting. Next steps: Create team snippets as Emmet custom snippets. Combine Emmet with UltiSnips/LuaSnip for composite templates. Integrate with LSP/formatter to build a consistent save workflow. References and links Emmet docs: https://docs.emmet.io/ emmet-vim repo: https://github.com/mattn/emmet-vim Vim Python3 provider: https://github.com/neovim/neovim/wiki/FAQ#python-support Meta Estimated reading: 11 minutes; for Vim/Neovim frontend engineers. Tags: vim, neovim, emmet, frontend, productivity; category: frontend. SEO keywords: emmet-vim, Vim Emmet, HTML CSS autocompletion. Updated: 2025-11-14. CTA Create a local demo.html and expand a few abbreviations yourself. If you hit key conflicts or new scenarios, open an issue or comment. If this helped, star mattn/emmet-vim to support the author. ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/frontend/emmet-vim-guide/","summary":"Practical Emmet notes for Vim/Neovim users: install, key mappings, runnable examples, validation checklist, and common pitfalls to 3x your page and component speed.","title":"Emmet-Vim Speed Guide: Write HTML/CSS with Abbreviations"},{"content":" Sorting series post 8 explains two engineering hybrid sorts: TimSort (stable, run detection + merge + insertion) used by Python/Java, and Introsort (quick + heap + insertion, unstable) behind C++ std::sort.\nTarget Readers Anyone who wants to understand built-in sort behavior, stability, and degeneration protection. Engineers who need a hybrid strategy balancing average performance and worst-case bounds. People preparing interviews or talks on TimSort/Introsort. Background and Motivation Pure quicksort can degrade; pure merge needs O(n) memory and does not fully exploit near-sortedness. 
Hybrid sorting combines strengths: TimSort uses natural runs and stable merge; Introsort falls back to heap when recursion depth is too large and uses insertion on small segments. A - Algorithm TimSort core flow (stable)\nScan array to detect monotonic runs (increasing/decreasing; reverse decreasing runs). Extend short runs to minrun and use insertion to finish them. Merge runs following stack rules with stable merge; near-sorted data yields long runs and fewer merges. Introsort core flow (unstable)\nStart with quicksort (random/median-of-three). When recursion depth exceeds a threshold (~2*log n), switch to heap sort to avoid O(n^2). Use insertion for segments smaller than a threshold (e.g., 16/24). C - Concepts Algorithm Stable Avg Time Worst Time Space Key Points TimSort Yes O(n log n) O(n log n) O(n) run detection + stable merge + insertion Introsort No O(n log n) O(n log n) O(1) quick start + depth fallback to heap + insertion run: a monotonic contiguous subsequence; more runs means more merges. minrun: lower bound on run length (typically 32~64); short runs are extended with insertion. Depth threshold: Introsort uses 2*floor(log2 n) as a cutoff for heap fallback. 
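The minrun bound above is not arbitrary: CPython picks it so that n/minrun is close to (but not above) a power of two, which keeps the run merges balanced. A sketch following the description in CPython's listsort notes (Java's TimSort uses 32 as the threshold instead of 64):

```python
def compute_minrun(n: int) -> int:
    """Return TimSort's minrun for an array of length n.

    For n >= 64 this takes the six most significant bits of n, adding 1
    if any of the remaining bits are set, yielding a value in 32..64.
    For n < 64 the whole array is handled as a single insertion-sorted run.
    """
    r = 0  # becomes 1 if any shifted-out bit was set
    while n >= 64:
        r |= n & 1
        n >>= 1
    return n + r

assert compute_minrun(63) == 63      # small array: one run
assert compute_minrun(64) == 32
assert compute_minrun(2048) == 32    # exact power of two divides evenly
assert compute_minrun(2049) == 33    # leftover bits force the +1
```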
E - Engineering Scenario 1: Python/Java default sort (TimSort idea) Background: stable sort with excellent performance on near-sorted data.\n# Simplified TimSort skeleton (idea only, not full merge rules) MINRUN = 32 def insertion(a, l, r): for i in range(l+1, r+1): key=a[i]; j=i-1 while j\u0026gt;=l and a[j]\u0026gt;key: a[j+1]=a[j]; j-=1 a[j+1]=key def timsort(a): n=len(a) # 1) detect runs + extend to MINRUN runs=[]; i=0 while i\u0026lt;n: j=i+1 while j\u0026lt;n and a[j]\u0026gt;=a[j-1]: j+=1 # simplified: only increasing runs l,r=i,j-1 if r-l+1 \u0026lt; MINRUN: end=min(n-1,l+MINRUN-1) insertion(a,l,end) r=end runs.append((l,r)) i=r+1 # 2) simplified merge: left to right import heapq while len(runs)\u0026gt;1: l1,r1 = runs.pop(0) l2,r2 = runs.pop(0) merge(a,l1,r1,l2,r2) runs.insert(0,(l1,r2)) return a def merge(a,l1,r1,l2,r2): buf = a[l1:r2+1] i=0; j=l2-l1; k=l1 while i\u0026lt;=r1-l1 and j\u0026lt;=r2-l1: if buf[i] \u0026lt;= buf[j]: a[k]=buf[i]; i+=1 else: a[k]=buf[j]; j+=1 k+=1 while i\u0026lt;=r1-l1: a[k]=buf[i]; i+=1; k+=1 while j\u0026lt;=r2-l1: a[k]=buf[j]; j+=1; k+=1 arr=[5,2,3,1,4] print(timsort(arr)) Scenario 2: C++ std::sort idea (Introsort) Background: low constants, in-place, worst-case bounded.\n#include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; void insertion(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int r){ for(int i=l+1;i\u0026lt;=r;++i){int key=a[i], j=i-1; while(j\u0026gt;=l \u0026amp;\u0026amp; a[j]\u0026gt;key){a[j+1]=a[j]; j--;} a[j+1]=key;} } int partition_mid(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int r){ int m=l+(r-l)/2; if(a[m]\u0026lt;a[l]) swap(a[m],a[l]); if(a[r]\u0026lt;a[l]) swap(a[r],a[l]); if(a[r]\u0026lt;a[m]) swap(a[r],a[m]); int pivot=a[m]; int i=l-1,j=r+1; while(true){ do{i++;}while(a[i]\u0026lt;pivot); do{j--;}while(a[j]\u0026gt;pivot); if(i\u0026gt;=j) return j; swap(a[i],a[j]); } } void heapsort(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int r){ make_heap(a.begin()+l, a.begin()+r+1); 
sort_heap(a.begin()+l, a.begin()+r+1); } void introsort(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int r, int depth){ while(r-l+1 \u0026gt; 16){ if(depth==0){ heapsort(a,l,r); return; } int p = partition_mid(a,l,r); if(p-l \u0026lt; r-p){ introsort(a,l,p,depth-1); l=p+1; } else { introsort(a,p+1,r,depth-1); r=p; } } insertion(a,l,r); } int main(){ vector\u0026lt;int\u0026gt; a={5,2,3,1,4,9,8,7,6}; int depth = 2*log2(a.size()); introsort(a,0,a.size()-1,depth); for(int x:a) cout\u0026lt;\u0026lt;x\u0026lt;\u0026lt;\u0026#34; \u0026#34;; } Scenario 3: Go/JavaScript hybrid sorting Go: built-in sort is similar to Introsort (quick + heap + insertion); check source for details. JS: use TimSort implementations for stability; otherwise mimic Introsort for in-place speed. R - Reflection Complexity: TimSort: worst O(n log n), faster on partially sorted data (long runs, fewer merges), space O(n). Introsort: worst O(n log n), average like quicksort, space O(1) ignoring recursion. Stability: TimSort is stable; Introsort is not. Trade-offs: Near-sorted + stable: TimSort (Python/Java default). Memory tight + low constants: Introsort (C++ std::sort). External sorting: TimSort/merge; with data larger than RAM, use multi-way merge. Why it works: hybrid strategies absorb strengths and avoid degeneration paths. S - Summary TimSort uses run detection + stable merge + insertion; excellent on near-sorted data and stable; default in Python/Java. Introsort starts with quicksort, falls back to heap at depth limit, uses insertion at the end; unstable but in-place and fast; used by C++ std::sort. Selection: stable + near-sorted -\u0026gt; TimSort; in-place + worst-case bound -\u0026gt; Introsort; external -\u0026gt; merge/TimSort; known range -\u0026gt; non-comparison. Understanding built-ins helps performance tuning and interviews. Practice Guide / Steps Decide on stability, memory budget, and degree of order. For TimSort: Implement run detection and reverse decreasing runs. 
Set minrun (32~64) and fill short runs with insertion. Implement stable merge and follow run-stack rules. For Introsort: Depth limit = 2*floor(log2 n); fallback to heap when exceeded. Use insertion for small segments; pivot random or median-of-three. Benchmark with random, nearly sorted, reverse, heavy duplicates. Common Pitfalls and Notes TimSort merge rules are complex; avoid unbalanced run stacks; keep stability. Introsort heap fallback must pass correct subranges; Hoare partition boundaries matter. Small-segment thresholds should be tuned (commonly 16~32). Runnable Example: Mini Introsort in JavaScript function insertion(a,l,r){ for(let i=l+1;i\u0026lt;=r;i++){ const key=a[i]; let j=i-1; while(j\u0026gt;=l \u0026amp;\u0026amp; a[j]\u0026gt;key){ a[j+1]=a[j]; j--; } a[j+1]=key; } } function partition(a,l,r){ const m=l+((r-l)\u0026gt;\u0026gt;1); if(a[m]\u0026lt;a[l]) [a[m],a[l]]=[a[l],a[m]]; if(a[r]\u0026lt;a[l]) [a[r],a[l]]=[a[l],a[r]]; if(a[r]\u0026lt;a[m]) [a[r],a[m]]=[a[m],a[r]]; const pivot=a[m]; let i=l-1,j=r+1; while(true){ do{i++;}while(a[i]\u0026lt;pivot); do{j--;}while(a[j]\u0026gt;pivot); if(i\u0026gt;=j) return j; [a[i],a[j]]=[a[j],a[i]]; } } function heapify(a,n,i,l){ while(true){ let largest=i, left=2*(i-l)+1+l, right=left+1; if(left\u0026lt;n \u0026amp;\u0026amp; a[left]\u0026gt;a[largest]) largest=left; if(right\u0026lt;n \u0026amp;\u0026amp; a[right]\u0026gt;a[largest]) largest=right; if(largest===i) break; [a[i],a[largest]]=[a[largest],a[i]]; i=largest; } } function heapsort(a,l,r){ const n=r+1; for(let i=Math.floor((l+r)/2); i\u0026gt;=l; i--) heapify(a,n,i,l); for(let end=r; end\u0026gt;l; end--){ [a[l],a[end]]=[a[end],a[l]]; heapify(a,end,l,l); } } function introsort(a,l=0,r=a.length-1,depth=2*Math.floor(Math.log2(a.length||1))){ while(r-l+1\u0026gt;16){ if(depth===0){ heapsort(a,l,r); return a; } const p=partition(a,l,r); if(p-l \u0026lt; r-p){ introsort(a,l,p,depth-1); l=p+1; } else { introsort(a,p+1,r,depth-1); r=p; } } insertion(a,l,r); 
return a; } console.log(introsort([5,2,3,1,4,9,8,7,6])); References and Further Reading Tim Peters, \u0026ldquo;Timsort\u0026rdquo; design notes (CPython source) Java Arrays.sort (object version) implementation Musser, \u0026ldquo;Introspective Sorting and Selection Algorithms\u0026rdquo; (1997) Bentley \u0026amp; McIlroy, \u0026ldquo;Engineering a Sort Function\u0026rdquo; (1993) Meta Reading time: approx. 16 min SEO keywords: TimSort, Introsort, std::sort, stable sort, hybrid sort Meta description: sorting series (8) explaining TimSort and Introsort strategies, stability, trade-offs, and engineering usage. Call to Action (CTA) Benchmark your dataset: built-in sort vs your hybrid implementation. If you need stability and near-sorted performance, try TimSort ideas; for in-place and worst-case bounds, try Introsort. Follow the series finale: the selection guide. ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/8.sorting-series-timsort-introsort/","summary":"Explain Python/Java TimSort and C++ std::sort Introsort: triggers, stability, complexity, trade-offs, with skeletons and selection guidance.","title":"Sorting Series (8): TimSort and Introsort - Engineering Patterns Behind Built-in Sorts"},{"content":" Sorting series post 7 focuses on non-comparison sorting: when key range or digit length is bounded, complexity can drop to O(n+k), but space, stability, and feasibility must be balanced.\nTarget Readers Engineers sorting integer keys with known range/digits. Learners seeking lower-than-n-log-n sorting for large batches. People comparing standard library comparison sorts vs non-comparison. Background and Motivation Comparison sorts have a lower bound of Omega(n log n); non-comparison sorts bypass that by using range/digit information. Cost: extra space and restricted applicability; implementation must handle stability and memory carefully. 
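The Omega(n log n) lower bound mentioned above comes from a counting argument: a comparison sort must distinguish all n! input orderings, and a binary decision tree with n! leaves has depth at least log2(n!), which by Stirling's approximation grows like n log2 n. A quick numeric check:

```python
import math

def comparison_lower_bound(n: int) -> float:
    """Worst-case comparisons any comparison sort needs: log2(n!)."""
    return math.log2(math.factorial(n))

# log2(n!) tracks n*log2(n) closely, which is why no comparison sort can
# beat O(n log n) -- and why bounded-range information is needed to do so.
for n in (10, 100, 1000):
    print(n, round(comparison_lower_bound(n)), round(n * math.log2(n)))
```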
A - Algorithm Covered algorithms: counting sort, bucket sort, radix sort (LSD).\nBasic examples\nCounting: [4, 2, 2, 8, 3], range 0..9, count -\u0026gt; prefix sums -\u0026gt; stable fill. Radix: group by ones/tens/hundreds digits, stable per digit. C - Concepts Algorithm Idea Time Space Stable Counting frequency + prefix sums O(n+k) O(k+n) Can be stable Bucket split by ranges, sort within buckets expected O(n+k) O(n+k) depends on inner sort Radix stable per-digit sort in multiple passes O(d*(n+b)) O(n+b) Yes if each pass is stable k: range size; d: number of digits; b: base (number of buckets). Stability: counting can be stable; radix requires stable passes; bucket depends on inner sort. E - Engineering Scenario 1: Small-range integer sort (Python counting) def counting_sort(a, max_val): cnt = [0]*(max_val+1) for x in a: cnt[x]+=1 # prefix sums for i in range(1, len(cnt)): cnt[i]+=cnt[i-1] out=[0]*len(a) for x in reversed(a): cnt[x]-=1 out[cnt[x]] = x return out print(counting_sort([4,2,2,8,3], 9)) Scenario 2: Bucket sort for known float distribution (JavaScript) Background: floats uniformly distributed in [0,1).\nfunction bucketSort(arr, buckets=10){ const B=Array.from({length:buckets},()=\u0026gt;[]); for(const x of arr){ const idx = Math.min(buckets-1, Math.floor(x*buckets)); B[idx].push(x); } for(const b of B) b.sort((a,b)=\u0026gt;a-b); return B.flat(); } console.log(bucketSort([0.78,0.17,0.39,0.26,0.72,0.94,0.21,0.12,0.23,0.68])); Scenario 3: Radix sort for large integers (Go, LSD) package main import \u0026#34;fmt\u0026#34; func radixLSD(a []int) { maxv := 0 for _,v := range a { if v\u0026gt;maxv { maxv=v } } exp := 1 buf := make([]int, len(a)) for maxv/exp \u0026gt; 0 { cnt := make([]int, 10) for _,v := range a { digit := (v/exp)%10; cnt[digit]++ } for i:=1;i\u0026lt;10;i++ { cnt[i]+=cnt[i-1] } for i:=len(a)-1;i\u0026gt;=0;i-- { d := (a[i]/exp)%10 cnt[d]-- buf[cnt[d]] = a[i] } copy(a, buf) exp *= 10 } } func main(){ a:=[]int{170,45,75,90,802,24,2,66}; 
radixLSD(a); fmt.Println(a) } Scenario 4: C++ counting sort (small range) #include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; vector\u0026lt;int\u0026gt; counting_sort(const vector\u0026lt;int\u0026gt;\u0026amp; a, int maxv){ vector\u0026lt;int\u0026gt; cnt(maxv+1), out(a.size()); for(int x:a) cnt[x]++; for(int i=1;i\u0026lt;=maxv;i++) cnt[i]+=cnt[i-1]; for(int i=(int)a.size()-1;i\u0026gt;=0;i--){ int x=a[i]; cnt[x]--; out[cnt[x]]=x; } return out; } Scenario 5: Rust radix sort (LSD) pub fn radix_lsd(a: \u0026amp;mut [u32]) { let mut maxv = *a.iter().max().unwrap(); let mut exp = 1u32; let n = a.len(); let mut buf = vec![0u32; n]; while maxv/exp \u0026gt; 0 { let mut cnt = [0usize; 10]; for \u0026amp;v in a.iter() { cnt[((v/exp)%10) as usize] += 1; } for i in 1..10 { cnt[i] += cnt[i-1]; } for \u0026amp;v in a.iter().rev() { let d = ((v/exp)%10) as usize; cnt[d] -= 1; buf[cnt[d]] = v; } a.copy_from_slice(\u0026amp;buf); exp *= 10; } } R - Reflection Complexity and prerequisites: Counting: O(n+k), k is range size; if k \u0026raquo; n, not worth it. Bucket: expected O(n+k) depends on distribution; worst-case can degrade. Radix: O(d*(n+b)), d digits, base b; each pass must be stable. Trade-offs: Memory: counting/bucket needs O(k) or O(n+k); large ranges are infeasible. Stability: counting and radix can be stable; bucket depends on inner sort. Data type: best for integers or keys mappable to integers (dates, IPs, fixed-length strings). Why it works: With bounded range/digits, non-comparison sorts break the n log n lower bound. Great for logs, bucketed stats, and bulk integer sorting. S - Summary Non-comparison sorts require known range/digits/distribution and can reach O(n+k). Counting is simple and stable for small ranges; radix works for multi-digit integers; bucket depends on distribution. Core risks: memory blow-up, distribution mismatch, missing stability. 
Selection: small range -\u0026gt; counting; moderate digits + stable -\u0026gt; radix; uniform floats -\u0026gt; bucket; otherwise compare-based. Practice Guide / Steps Estimate range/digits: if k is close to or larger than n, be cautious. Decide stability: radix needs stable per-digit sort; bucket may need stable inner sort. Control memory: counting array length = max-min+1; radix buffers at least O(n). Test with random, all equal, huge range, skewed distribution. Common Pitfalls and Notes Counting sort with negatives needs offset (or split positive/negative). Radix loses stability if any digit pass is unstable. Bucket sort degrades on skewed distributions; increase bucket count or sort large buckets with other methods. Large memory usage calls for fallback to comparison sort or chunking. Runnable Example: Counting with negatives (Python) def counting_sort_with_neg(a): mn, mx = min(a), max(a) offset = -mn cnt = [0]*(mx - mn + 1) for x in a: cnt[x+offset]+=1 for i in range(1,len(cnt)): cnt[i]+=cnt[i-1] out=[0]*len(a) for x in reversed(a): cnt[x+offset]-=1 out[cnt[x+offset]] = x return out print(counting_sort_with_neg([3,-1,2,-1,0])) References and Further Reading CLRS non-comparison sorting chapters Donald Knuth, \u0026ldquo;The Art of Computer Programming, Vol. 3\u0026rdquo; (Sorting and Searching) Lower bounds and word-RAM model discussions on integer sorting Meta Reading time: approx. 15 min SEO keywords: counting sort, bucket sort, radix sort, non-comparison, O(n+k) Meta description: sorting series (7) explaining prerequisites, complexity, and engineering trade-offs for counting/bucket/radix with multilingual examples. Call to Action (CTA) Estimate range/digits of your dataset and implement a counting or radix sort benchmark. If distribution is skewed, tune bucket counts or use a hybrid inside large buckets. 
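The reflection above notes that radix also fits keys mappable to integers, such as dates or fixed-length strings. A hedged sketch of that idea (the function name radixFixedStrings is invented; it assumes all keys are equal-length ASCII strings, one stable counting pass per character position, right to left):

```javascript
// LSD radix sort for equal-length ASCII strings (e.g. ISO dates).
function radixFixedStrings(arr) {
  if (arr.length === 0) return arr;
  const w = arr[0].length;              // assumption: every key has this length
  let a = arr.slice();
  for (let pos = w - 1; pos >= 0; pos--) {
    const cnt = new Array(257).fill(0); // 256 char codes, offset by one
    for (const s of a) cnt[s.charCodeAt(pos) + 1]++;
    for (let i = 1; i < 257; i++) cnt[i] += cnt[i - 1];
    const out = new Array(a.length);
    for (const s of a) out[cnt[s.charCodeAt(pos)]++] = s; // stable fill
    a = out;
  }
  return a;
}

const dates = ['2024-03-01', '2023-12-31', '2024-01-15'];
// yields the dates in lexicographic (chronological) order
```

Each pass is a stable counting sort, so the combined result matches a plain lexicographic sort while doing no character-by-character comparisons.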
","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/7.sorting-series-non-comparison/","summary":"Explain prerequisites, complexity, implementation details, and pitfalls of non-comparison sorts with multilingual examples for counting, bucket, and radix.","title":"Sorting Series (7): Non-Comparison Sorting - Counting, Bucket, Radix and the Range/Digit Tradeoff"},{"content":" For frontend developers with 1-2 years of experience who want a fast, state-driven button in Svelte. Covers state colors, disabled and loading states, accessibility (ARIA), testing, pitfalls, and runnable examples.\nTarget readers and prerequisites Frontend engineers familiar with JS/TS and new to Svelte or already using it. Developers who need a unified button style, state, and interaction in a project. Requirements: Node 18+, Svelte 5, package manager (npm/pnpm), can run npm create svelte@latest. Background / Motivation Buttons are high-frequency interactions, but style, state, and accessibility are often ignored. Dynamic class names without null protection lead to undefined or broken styles. Accessibility (keyboard and ARIA) plus loading/disabled states are product-grade requirements. Consistency needs a centralized state-to-style mapping to avoid magic strings everywhere. Core concepts State mapping: map business states to class strings via a function, not nested ternaries in templates. Optional chaining (?.) and nullish coalescing (??): safely read backend fields and provide defaults. ARIA and keyboard access: aria-busy, aria-disabled, role, tabindex help screen readers and keyboard users. Visual hierarchy: primary, secondary, ghost buttons. 
Environment and dependencies Node 18+, Svelte 5 UI utility classes: examples use Tailwind (replace with any styling system) Recommended commands: npm create svelte@latest demo-buttons cd demo-buttons npm install Practical steps 1) Centralize state-to-style mapping // statusTone.ts export function statusTone(status?: string) { if (status === \u0026#39;succeeded\u0026#39; || status === \u0026#39;completed\u0026#39;) { return \u0026#39;bg-emerald-600 hover:bg-emerald-700 text-white border border-emerald-600\u0026#39;; } if (status === \u0026#39;failed\u0026#39;) { return \u0026#39;bg-rose-600 hover:bg-rose-700 text-white border border-rose-600\u0026#39;; } if (status === \u0026#39;processing\u0026#39; || status === \u0026#39;pending\u0026#39;) { return \u0026#39;bg-amber-500 hover:bg-amber-600 text-white border border-amber-500\u0026#39;; } return \u0026#39;bg-slate-200 text-slate-700 border border-slate-300\u0026#39;; } Why: keep status-to-class logic centralized and maintainable; supports both completed and succeeded.\n2) Safe values inside a Svelte component \u0026lt;script lang=\u0026#34;ts\u0026#34;\u0026gt; import { statusTone } from \u0026#39;./statusTone\u0026#39;; export let status: string | undefined; export let loading = false; export let label = \u0026#39;Submit\u0026#39;; \u0026lt;/script\u0026gt; \u0026lt;button class={`inline-flex items-center gap-2 rounded-full px-4 py-2 text-sm font-semibold transition ${statusTone(status)}`} aria-busy={loading} aria-disabled={loading} disabled={loading} \u0026gt; {#if loading} \u0026lt;span class=\u0026#34;h-3 w-3 animate-spin rounded-full border-2 border-white border-t-transparent\u0026#34;\u0026gt;\u0026lt;/span\u0026gt; {/if} {label ?? \u0026#39;Submit\u0026#39;} \u0026lt;/button\u0026gt; Notes:\nlabel ?? 'Submit' provides a default label safely. aria-busy, aria-disabled, and disabled stay in sync. 3) Optional chaining and nullish coalescing example {#if detailStatus?.status ?? 
record.status} \u0026lt;span class=\u0026#34;text-xs text-slate-500\u0026#34;\u0026gt; Current status: {detailStatus?.status ?? record.status ?? \u0026#39;pending\u0026#39;} \u0026lt;/span\u0026gt; {/if} ?. avoids errors if detailStatus is undefined, ?? falls back to a default.\n4) Keyboard and screen reader support For non-\u0026lt;button\u0026gt; elements, add: role=\u0026quot;button\u0026quot;, tabindex=\u0026quot;0\u0026quot;, aria-label=\u0026quot;...\u0026quot;. Handle on:keydown for Enter or Space. Sync loading/disabled state with aria-busy and aria-disabled. 5) Common variants Primary: main action, high-contrast or brand color. Secondary: dark or outline style for secondary actions. Ghost: transparent background with border. Icon button: add aria-label for screen readers. 6) Skeleton loading / disabled strategy Loading: show spinner, block double-submit; use disabled and aria-busy. Disabled: for permission/quota conditions, use weaker style like opacity-60 cursor-not-allowed. 7) Events and error handling Wrap click: set loading optimistically, run async work, reset in finally. On error: show toast, and color with statusTone('failed') if needed. Runnable snippet \u0026lt;script lang=\u0026#34;ts\u0026#34;\u0026gt; import { statusTone } from \u0026#39;./statusTone\u0026#39;; let status: \u0026#39;pending\u0026#39; | \u0026#39;processing\u0026#39; | \u0026#39;succeeded\u0026#39; | \u0026#39;failed\u0026#39; = \u0026#39;pending\u0026#39;; let loading = false; async function simulate() { loading = true; status = \u0026#39;processing\u0026#39;; await new Promise((r) =\u0026gt; setTimeout(r, 1200)); status = Math.random() \u0026gt; 0.5 ? 
\u0026#39;succeeded\u0026#39; : \u0026#39;failed\u0026#39;; loading = false; } \u0026lt;/script\u0026gt; \u0026lt;div class=\u0026#34;space-y-3\u0026#34;\u0026gt; \u0026lt;button class={`inline-flex items-center gap-2 rounded-full px-4 py-2 text-sm font-semibold transition ${statusTone(status)}`} aria-busy={loading} aria-disabled={loading} disabled={loading} on:click={simulate} \u0026gt; {#if loading} \u0026lt;span class=\u0026#34;h-3 w-3 animate-spin rounded-full border-2 border-white border-t-transparent\u0026#34;\u0026gt;\u0026lt;/span\u0026gt; {/if} {status === \u0026#39;pending\u0026#39; ? \u0026#39;Start\u0026#39; : status === \u0026#39;processing\u0026#39; ? \u0026#39;Processing...\u0026#39; : status === \u0026#39;succeeded\u0026#39; ? \u0026#39;Done\u0026#39; : \u0026#39;Retry\u0026#39;} \u0026lt;/button\u0026gt; \u0026lt;p class=\u0026#34;text-sm text-slate-600\u0026#34;\u0026gt;Current status: {status}\u0026lt;/p\u0026gt; \u0026lt;/div\u0026gt; Run and verify:\nnpm run dev # Page shows the button; click it to see Processing... then success or failure color Common questions and notes Inconsistent status values: backend may return succeeded/completed; handle both. Long class strings: you can use clsx or classnames, but keep mapping logic centralized. Accessibility gaps: custom elements need role/tabindex/aria-label; loading needs aria-busy. Disabled styles: add opacity-60 cursor-not-allowed for clarity. Default text: use ?? instead of || to avoid empty string issues. Testing checklist Unit: statusTone returns expected classes for each state. Component: when loading, button.disabled === true and aria-busy=\u0026quot;true\u0026quot; exists. Accessibility: Tab focuses, Enter/Space triggers; aria-label present for icon buttons. Visual: contrast ratio \u0026gt;= 4.5:1 for text on backgrounds. Best practices Split mapping, structure, and a11y: function (state-\u0026gt;class) + template + accessibility helpers. 
Define the state machine before styling; avoid scattered magic strings. Default to accessibility: keyboard, screen reader, and synchronized disabled/busy states. Provide a runnable example for team reuse. Summary / Next steps The key is \u0026ldquo;state mapping + safe values + a11y sync\u0026rdquo;. statusTone centralizes styles, ?. and ?? make data safe, ARIA makes it production ready. Next: align with your design system (colors/sizes/icons), publish a Button component, and add Playwright a11y checks. References Svelte docs: events and accessibility MDN: Optional chaining, Nullish coalescing WAI-ARIA Authoring Practices: Button Call to Action (CTA) Copy the example into your component library and replace colors/states. Audit existing buttons for missing aria-* and disabled styles. ","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/frontend/svelte-button-config-guide/","summary":"Build reusable buttons in Svelte: dynamic classes, optional chaining and nullish coalescing, safe defaults, state-driven styles, accessibility, testing, and common pitfalls.","title":"Svelte Button Configuration Guide: States, Styles, and Accessibility"},{"content":" Sorting series post 6 focuses on heap sort: in-place O(n log n), unstable, slightly higher constants but worst-case guaranteed. It is also the foundation of streaming top-k.\nTarget Readers Engineers who need in-place sorting with worst-case O(n log n) guarantees. Learners connecting priority queues, top-k, and heap sort. People comparing quick/merge/heap selection. Background and Motivation Heap sort builds a heap and repeatedly extracts the max; best/avg/worst are all O(n log n). Pros: in-place, worst-case safe. Cons: unstable, cache-unfriendly, higher constants than quicksort. Shares structure with priority queues and streaming top-k, giving strong engineering value. A - Algorithm Steps\nBuild a max-heap (bottom-up O(n)). Repeatedly swap root with tail, shrink heap, sift down to restore heap (O(log n) each). 
Basic example Array [4, 10, 3, 5, 1]:\nAfter heapify: [10, 5, 3, 4, 1]. Swap root and tail -\u0026gt; [1,5,3,4,10], sift down -\u0026gt; [5,4,3,1,10]. Repeat until sorted. C - Concepts Concept Description Heap property Parent \u0026gt;= children (max-heap), children of i are 2i+1 and 2i+2. Heapify Sift down from last non-leaf to root, O(n). Sift down Swap node downward to restore heap, O(log n). Stability Unstable; swaps can reorder equals. Space In-place O(1) extra space. Complexity\nTime: heapify O(n) + n times sift O(log n) =\u0026gt; O(n log n); worst-case same. Space: O(1); recursion uses O(log n) stack, iterative uses O(1). E - Engineering Scenario 1: In-place backend sorting (C) Background: need in-place and worst-case guarantees.\nvoid heapify(int *a, int n, int i){ while(1){ int l=2*i+1, r=2*i+2, largest=i; if(l\u0026lt;n \u0026amp;\u0026amp; a[l]\u0026gt;a[largest]) largest=l; if(r\u0026lt;n \u0026amp;\u0026amp; a[r]\u0026gt;a[largest]) largest=r; if(largest==i) break; int t=a[i]; a[i]=a[largest]; a[largest]=t; i=largest; } } void heap_sort(int *a, int n){ for(int i=n/2-1;i\u0026gt;=0;i--) heapify(a,n,i); for(int end=n-1; end\u0026gt;0; end--){ int t=a[0]; a[0]=a[end]; a[end]=t; heapify(a,end,0); } } Scenario 2: Streaming top-k (Python, min-heap) Background: maintain top k values in a stream.\nimport heapq def topk(stream, k): h=[] for x in stream: if len(h)\u0026lt;k: heapq.heappush(h, x) else: if x\u0026gt;h[0]: heapq.heapreplace(h, x) return sorted(h, reverse=True) print(topk([5,1,9,3,12,4], 3)) # [12,9,5] Scenario 3: Go priority queue + sorting Background: use container/heap to build a heap-based sort.\npackage main import ( \u0026#34;container/heap\u0026#34; \u0026#34;fmt\u0026#34; ) type IntHeap []int func (h IntHeap) Len() int { return len(h) } func (h IntHeap) Less(i, j int) bool { return h[i] \u0026lt; h[j] } func (h IntHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] } func (h *IntHeap) Push(x interface{}) { *h = append(*h, x.(int)) } func (h 
*IntHeap) Pop() interface{} { old := *h; n := len(old); x := old[n-1]; *h = old[:n-1]; return x } func heapSort(a []int) []int { h := IntHeap(a) heap.Init(\u0026amp;h) res := make([]int, 0, len(a)) for h.Len()\u0026gt;0 { res = append(res, heap.Pop(\u0026amp;h).(int)) } return res // ascending } func main(){ fmt.Println(heapSort([]int{4,10,3,5,1})) } Scenario 4: Rust in-place heap sort pub fn heap_sort(a: \u0026amp;mut [i32]) { let n = a.len(); // build max-heap for i in (0..n/2).rev() { sift_down(a, i, n); } for end in (1..n).rev() { a.swap(0, end); sift_down(a, 0, end); } } fn sift_down(a: \u0026amp;mut [i32], mut i: usize, n: usize) { loop { let l = 2*i+1; let r = l+1; let mut largest = i; if l \u0026lt; n \u0026amp;\u0026amp; a[l] \u0026gt; a[largest] { largest = l; } if r \u0026lt; n \u0026amp;\u0026amp; a[r] \u0026gt; a[largest] { largest = r; } if largest == i { break; } a.swap(i, largest); i = largest; } } Scenario 5: JavaScript concise version function heapify(a, n, i){ while(true){ let l=2*i+1, r=2*i+2, largest=i; if(l\u0026lt;n \u0026amp;\u0026amp; a[l]\u0026gt;a[largest]) largest=l; if(r\u0026lt;n \u0026amp;\u0026amp; a[r]\u0026gt;a[largest]) largest=r; if(largest===i) break; [a[i],a[largest]]=[a[largest],a[i]]; i=largest; } } function heapSort(a){ const n=a.length; for(let i=Math.floor(n/2)-1;i\u0026gt;=0;i--) heapify(a,n,i); for(let end=n-1;end\u0026gt;0;end--){ [a[0],a[end]]=[a[end],a[0]]; heapify(a,end,0); } return a; } console.log(heapSort([4,10,3,5,1])); R - Reflection Complexity: time O(n log n) in worst/avg/best; space O(1). Alternatives: Need stability -\u0026gt; merge/TimSort. Better constants/cache -\u0026gt; quicksort often faster. Known range -\u0026gt; counting/bucket/radix. Why it works: Worst-case guarantees, suitable when degradation is unacceptable. In-place, good for memory-constrained environments. Shares structure with priority queues and streaming top-k. 
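The top-k connection above can also be sketched in plain JavaScript, mirroring the Python heapq scenario (JavaScript has no built-in heap, so the sift helpers here are handwritten for illustration): keep a size-k min-heap whose root is the smallest retained value, and replace the root whenever a larger value arrives.

```javascript
// Streaming top-k with a size-k min-heap, O(n log k) overall.
function topK(stream, k) {
  const h = []; // array-backed min-heap
  const siftUp = (i) => {
    while (i > 0) {
      const p = (i - 1) >> 1;
      if (h[p] <= h[i]) break;
      [h[p], h[i]] = [h[i], h[p]]; i = p;
    }
  };
  const siftDown = (i) => {
    for (;;) {
      let s = i; const l = 2 * i + 1, r = l + 1;
      if (l < h.length && h[l] < h[s]) s = l;
      if (r < h.length && h[r] < h[s]) s = r;
      if (s === i) break;
      [h[s], h[i]] = [h[i], h[s]]; i = s;
    }
  };
  for (const x of stream) {
    if (h.length < k) { h.push(x); siftUp(h.length - 1); }
    else if (x > h[0]) { h[0] = x; siftDown(0); } // evict current minimum
  }
  return h.slice().sort((a, b) => b - a); // largest first
}
// topK([5,1,9,3,12,4], 3) matches the Python scenario: [12, 9, 5]
```

Only sift-up and sift-down are needed, the same primitives heap sort uses, which is the structural overlap the reflection points at.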
S - Summary Heap sort: in-place, unstable, worst-case O(n log n), higher constants than quicksort. Engineering: heaps are common for top-k/streaming; full heap sort is less visible in libraries but available (C++ make_heap/sort_heap). If you need stability or near-sorted optimization, use merge/TimSort; for low constants, use quick/Introsort; heap sort shines when you need in-place and worst-case guarantees. Heapify bottom-up is O(n); iterative sift-down avoids recursion. Practice Guide / Steps Implement heapify (bottom-up) and sift-down (iterative); verify child index math. If only top-k is required, maintain a min-heap of size k, space O(k). Benchmark random, reversed, heavy duplicates; record swap counts and time to observe cache effects. For stability, add original index as a secondary key (higher constant). Common Pitfalls and Notes Child index math is easy to get wrong: 2i+1, 2i+2; continue sifting after swap. Recursive sift-down uses O(log n) stack; iterative avoids stack risk. Heap sort is unstable; equal elements may reorder. Runnable Example: Minimal Python def heap_sort(a): n=len(a) def sift(i, size): while True: l,r=2*i+1,2*i+2; largest=i if l\u0026lt;size and a[l]\u0026gt;a[largest]: largest=l if r\u0026lt;size and a[r]\u0026gt;a[largest]: largest=r if largest==i: break a[i],a[largest]=a[largest],a[i]; i=largest for i in range(n//2-1,-1,-1): sift(i,n) for end in range(n-1,0,-1): a[0],a[end]=a[end],a[0] sift(0,end) return a print(heap_sort([4,10,3,5,1])) References and Further Reading CLRS \u0026ldquo;Introduction to Algorithms\u0026rdquo; heap sort C++ std::make_heap / std::sort_heap implementation William Cochran, \u0026ldquo;Heaps and Priority Queues\u0026rdquo; notes Meta Reading time: approx. 14 min SEO keywords: heap sort, in-place sorting, top-k, priority queue Meta description: sorting series (6) explaining heap sort, heapify/sift, complexity, and engineering trade-offs with multilingual examples. 
Call to Action (CTA) Compare quicksort vs heap sort on the same dataset and observe cache effects. If you need top-k, implement a min-heap and benchmark it. Follow the series: non-comparison sorting, TimSort/Introsort, selection guide. ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/6.sorting-series-heap-sort/","summary":"Explain heap sort principles, complexity, and engineering scenarios; compare with quick/merge; include multilingual implementations and top-k examples.","title":"Sorting Series (6): Heap Sort - In-place O(n log n) with Worst-Case Guarantees"},{"content":" Sorting series post 5 focuses on quicksort: average O(n log n), in-place, low constant factors, but it needs pivot strategy and tail recursion optimization to avoid worst-case O(n^2) and deep stacks. This is the ACERS view from theory to engineering practice.\nTarget Readers Developers who want a production-ready quicksort. Readers curious about pivot choice, duplicates handling, tail-recursion/hybrid strategies. People who want to understand std::sort / Introsort design motivation. Background and Motivation Quicksort is often the default because it is in-place, cache-friendly, and fast, but worst-case O(n^2) and duplicates can hurt. Engineering practice uses random pivot, median-of-three, three-way partition, tail recursion, and insertion for small segments. A - Algorithm Theme: Achieve average O(n log n) with in-place and low constants, while mitigating degeneration.\nBasic example Array [3, 5, 2, 2, 8], pivot = 3:\nAfter partition -\u0026gt; [2,2,3,5,8], left \u0026lt; 3, right \u0026gt;= 3. Recurse on left and right segments. C - Concepts Key Concept Description Pivot selection Random, median-of-three (first/mid/last), or median-of-five to reduce degeneration. Partition strategy Lomuto (single side) is simple but swaps more; Hoare (two-side) swaps less; three-way partition handles duplicates. 
Duplicates Three-way partition (\u0026lt;, =, \u0026gt;) prevents degeneration with many equal keys. Tail recursion Always recurse on smaller side and iterate on larger side to keep stack O(log n). Hybrid strategy Use insertion for small segments; fall back to heap sort when recursion depth is too high (Introsort). Complexity\nAverage time O(n log n), worst-case O(n^2) with extreme pivots. Space: average recursion stack O(log n), worst-case O(n); tail recursion reduces risk. Unstable, in-place. E - Engineering Scenario 1: General backend sorting (Go) Background: dataset ~1e5, random distribution. Why: Go sort.Slice uses quick/heap hybrid; example includes insertion for small segments.\npackage main import \u0026#34;fmt\u0026#34; func insertion(a []int, l, r int) { for i := l+1; i \u0026lt;= r; i++ { key := a[i]; j := i-1 for j \u0026gt;= l \u0026amp;\u0026amp; a[j] \u0026gt; key { a[j+1]=a[j]; j-- } a[j+1]=key } } func partition(a []int, l, r int) int { pivot := a[(l+r)\u0026gt;\u0026gt;1] i, j := l, r for i \u0026lt;= j { for a[i] \u0026lt; pivot { i++ } for a[j] \u0026gt; pivot { j-- } if i \u0026lt;= j { a[i], a[j] = a[j], a[i]; i++; j-- } } return i } func quick(a []int, l, r int) { for r-l+1 \u0026gt; 16 { p := partition(a, l, r) if p-l \u0026lt; r-p { quick(a, l, p-1); l = p } else { quick(a, p, r); r = p-1 } } insertion(a, l, r) } func main(){ arr := []int{3,5,2,2,8,1,7} quick(arr,0,len(arr)-1) fmt.Println(arr) } Scenario 2: Many duplicates (Python three-way) Background: many equal values (e.g., bucketed IDs), two-way partition degrades. 
Why: three-way partition handles equal keys in one pass.\ndef quick3(a, l=0, r=None): if r is None: r = len(a)-1 while l \u0026lt; r: if r - l + 1 \u0026lt;= 16: for i in range(l+1, r+1): key=a[i]; j=i-1 while j\u0026gt;=l and a[j]\u0026gt;key: a[j+1]=a[j]; j-=1 a[j+1]=key return pivot = a[(l+r)//2] lt, i, gt = l, l, r while i \u0026lt;= gt: if a[i] \u0026lt; pivot: a[lt], a[i] = a[i], a[lt]; lt+=1; i+=1 elif a[i] \u0026gt; pivot: a[i], a[gt] = a[gt], a[i]; gt-=1 else: i+=1 if lt-l \u0026lt; r-gt: quick3(a, l, lt-1); l = gt+1 else: quick3(a, gt+1, r); r = lt-1 return a arr=[3,5,2,2,8,1,7,2,2] quick3(arr) print(arr) Scenario 3: C++ performance-sensitive partition (Hoare + median-of-three) Background: performance-sensitive, need fewer swaps and more robust pivot.\n#include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; int median3(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int r){ int m = l + (r-l)/2; if(a[m] \u0026lt; a[l]) swap(a[m], a[l]); if(a[r] \u0026lt; a[l]) swap(a[r], a[l]); if(a[m] \u0026lt; a[r]) swap(a[m], a[r]); // a[r] = median return a[r]; } int partition(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int r){ int pivot = median3(a,l,r); int i=l-1, j=r; while(true){ do{ i++; } while(a[i] \u0026lt; pivot); do{ j--; } while(a[j] \u0026gt; pivot); if(i\u0026gt;=j) break; swap(a[i], a[j]); } swap(a[i], a[r]); return i; } void quick(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int r){ while(l \u0026lt; r){ if(r-l+1 \u0026lt;= 16){ for(int i=l+1;i\u0026lt;=r;++i){int key=a[i], j=i-1; while(j\u0026gt;=l \u0026amp;\u0026amp; a[j]\u0026gt;key){a[j+1]=a[j]; j--;} a[j+1]=key;} return; } int p = partition(a,l,r); if(p-l \u0026lt; r-p){ quick(a,l,p-1); l=p+1; } else{ quick(a,p+1,r); r=p-1; } } } Scenario 4: JavaScript small arrays with median-of-three Background: mid-size arrays, use median-of-three + insertion threshold.\nfunction insertion(a,l,r){ for(let i=l+1;i\u0026lt;=r;i++){ const key=a[i]; let j=i-1; while(j\u0026gt;=l \u0026amp;\u0026amp; 
a[j]\u0026gt;key){ a[j+1]=a[j]; j--; } a[j+1]=key; } } function partition(a,l,r){ const m = l + ((r-l)\u0026gt;\u0026gt;1); if(a[m]\u0026lt;a[l]) [a[m],a[l]]=[a[l],a[m]]; if(a[r]\u0026lt;a[l]) [a[r],a[l]]=[a[l],a[r]]; if(a[m]\u0026lt;a[r]) [a[m],a[r]]=[a[r],a[m]]; const pivot = a[r]; let i=l-1; for(let j=l;j\u0026lt;r;j++) if(a[j]\u0026lt;=pivot){ i++; [a[i],a[j]]=[a[j],a[i]]; } [a[i+1],a[r]]=[a[r],a[i+1]]; return i+1; } function quick(a,l=0,r=a.length-1){ while(l\u0026lt;r){ if(r-l+1\u0026lt;=16){ insertion(a,l,r); return a; } const p=partition(a,l,r); if(p-l \u0026lt; r-p){ quick(a,l,p-1); l=p+1; } else{ quick(a,p+1,r); r=p-1; } } return a; } console.log(quick([3,5,2,2,8,1,7])); R - Reflection Complexity: average O(n log n), worst O(n^2); stack depth O(log n) avg; tail recursion + insertion threshold keep it bounded. Alternatives: Need stability or guaranteed upper bound -\u0026gt; merge/heap/TimSort. Known range -\u0026gt; counting/bucket/radix. Standard libraries: C++ std::sort uses Introsort (quick + heap + insertion); Python/Java use TimSort (stable). Why this works: Random/median-of-three reduces degeneration. Three-way partition handles duplicates. Tail recursion + insertion threshold reduces stack depth and constants. S - Summary Quicksort strengths: in-place, low constants, cache-friendly, average O(n log n). Risks: extreme pivot -\u0026gt; O(n^2); duplicates -\u0026gt; degeneration; unstable. Robust strategy: random/median-of-three pivot, three-way partition, insertion for small segments, tail recursion; use Introsort ideas when needed. Selection: stability/external -\u0026gt; merge/TimSort; memory tight + random -\u0026gt; quick/Introsort; duplicates -\u0026gt; three-way. Practice Guide / Steps Pivot: random or median-of-three; median-of-five for extra robustness. Duplicates: use three-way partition; otherwise two-way is fine. Set small segment threshold (e.g., 16/24) and insert; set depth limit to fall back to heap (Introsort). 
Test sets: random, reverse, all equal, many duplicates, large arrays. Common Pitfalls and Notes Lomuto partition swaps more; Hoare partition returns an index that changes recursion boundaries. Deep recursion can overflow stack: use tail recursion or iterative loops. Not handling duplicates causes degeneration; three-way is critical. Fixed first-element pivot degenerates on sorted data. Runnable Examples: Minimal Multilang Python (random pivot + three-way) import random def quick3(a, l=0, r=None): if r is None: r = len(a)-1 while l \u0026lt; r: if r-l+1 \u0026lt;= 16: for i in range(l+1, r+1): key=a[i]; j=i-1 while j\u0026gt;=l and a[j]\u0026gt;key: a[j+1]=a[j]; j-=1 a[j+1]=key return a pivot_i = random.randint(l, r) a[l], a[pivot_i] = a[pivot_i], a[l] pivot = a[l] lt, i, gt = l, l+1, r while i \u0026lt;= gt: if a[i] \u0026lt; pivot: a[lt], a[i] = a[i], a[lt]; lt+=1; i+=1 elif a[i] \u0026gt; pivot: a[i], a[gt] = a[gt], a[i]; gt-=1 else: i+=1 if lt-l \u0026lt; r-gt: quick3(a, l, lt-1); l = gt+1 else: quick3(a, gt+1, r); r = lt-1 return a arr=[3,5,2,2,8,1,7,2,2] quick3(arr); print(arr) C (Hoare partition + insertion threshold) #include \u0026lt;stdlib.h\u0026gt; void insertion(int *a,int l,int r){ for(int i=l+1;i\u0026lt;=r;i++){ int key=a[i], j=i-1; while(j\u0026gt;=l \u0026amp;\u0026amp; a[j]\u0026gt;key){ a[j+1]=a[j]; j--; } a[j+1]=key; } } int partition(int *a,int l,int r){ int pivot=a[(l+r)/2]; int i=l-1, j=r+1; while(1){ do{ i++; } while(a[i]\u0026lt;pivot); do{ j--; } while(a[j]\u0026gt;pivot); if(i\u0026gt;=j) return j; int t=a[i]; a[i]=a[j]; a[j]=t; } } void quick(int *a,int l,int r){ while(l\u0026lt;r){ if(r-l+1\u0026lt;=16){ insertion(a,l,r); return; } int p=partition(a,l,r); if(p-l \u0026lt; r-p){ quick(a,l,p); l=p+1; } else{ quick(a,p+1,r); r=p; } } } C++ (median-of-three + Hoare) int partition(vector\u0026lt;int\u0026gt;\u0026amp; a,int l,int r){ int m=l+(r-l)/2; if(a[m]\u0026lt;a[l]) swap(a[m],a[l]); if(a[r]\u0026lt;a[l]) swap(a[r],a[l]); 
if(a[r]\u0026lt;a[m]) swap(a[r],a[m]); int pivot=a[m]; int i=l-1,j=r+1; while(true){ do{i++;}while(a[i]\u0026lt;pivot); do{j--;}while(a[j]\u0026gt;pivot); if(i\u0026gt;=j) return j; swap(a[i],a[j]); } } Go (two-way, simplified) func Quick(a []int, l, r int){ for l\u0026lt;r { if r-l+1 \u0026lt;= 16 { insertion(a,l,r); return } p := partition(a,l,r) if p-l \u0026lt; r-p { Quick(a,l,p-1); l=p } else { Quick(a,p,r); r=p-1 } } } Rust (three-way) pub fn quick3(a: \u0026amp;mut [i32]) { fn insertion(a: \u0026amp;mut [i32]) { for i in 1..a.len() { let key=a[i]; let mut j=i as i32-1; while j\u0026gt;=0 \u0026amp;\u0026amp; a[j as usize]\u0026gt;key { a[(j+1) as usize]=a[j as usize]; j-=1; } a[(j+1) as usize]=key; } } fn sort(a: \u0026amp;mut [i32]) { let n=a.len(); if n\u0026lt;=16 { insertion(a); return; } let pivot=a[n/2]; let (mut lt, mut i, mut gt) = (0,0,n-1); while i\u0026lt;=gt { if a[i]\u0026lt;pivot { a.swap(lt,i); lt+=1; i+=1; } else if a[i]\u0026gt;pivot { a.swap(i,gt); if gt==0 {break;} gt-=1; } else { i+=1; } } sort(\u0026amp;mut a[..lt]); sort(\u0026amp;mut a[gt+1..]); } if !a.is_empty() { sort(a); } } JavaScript (median-of-three + insertion) function insertion(a,l,r){ for(let i=l+1;i\u0026lt;=r;i++){ const key=a[i]; let j=i-1; while(j\u0026gt;=l \u0026amp;\u0026amp; a[j]\u0026gt;key){ a[j+1]=a[j]; j--; } a[j+1]=key; } } function quick(a,l=0,r=a.length-1){ while(l\u0026lt;r){ if(r-l+1\u0026lt;=16){ insertion(a,l,r); return a; } const m=l+((r-l)\u0026gt;\u0026gt;1); if(a[m]\u0026lt;a[l]) [a[m],a[l]]=[a[l],a[m]]; if(a[r]\u0026lt;a[l]) [a[r],a[l]]=[a[l],a[r]]; if(a[r]\u0026lt;a[m]) [a[r],a[m]]=[a[m],a[r]]; const pivot=a[m]; let i=l, j=r; while(i\u0026lt;=j){ while(a[i]\u0026lt;pivot) i++; while(a[j]\u0026gt;pivot) j--; if(i\u0026lt;=j){ [a[i],a[j]]=[a[j],a[i]]; i++; j--; } } if(j-l \u0026lt; r-i){ quick(a,l,j); l=i; } else { quick(a,i,r); r=j; } } return a; } console.log(quick([3,5,2,2,8,1,7])); Best Practices Default to language built-in sort; if custom, 
include random/median-of-three, three-way for duplicates, insertion threshold, tail recursion. Use merge/TimSort for stability; consider Introsort for strict upper bounds. Benchmark on random, reverse, all equal, many duplicates, and large arrays. Conclusion Quicksort is fast and in-place but must use pivot strategy and three-way partition to avoid degeneration. Tail recursion + insertion thresholds are standard engineering practice; fall back to heap if depth grows (Introsort). Selection rules: stability/external -\u0026gt; merge/TimSort; memory-tight random -\u0026gt; quick/Introsort; many duplicates -\u0026gt; three-way. References and Further Reading Hoare, \u0026ldquo;Quicksort\u0026rdquo; (1961) Bentley \u0026amp; McIlroy, \u0026ldquo;Engineering a Sort Function\u0026rdquo; (1993) C++ std::sort and std::stable_sort implementation notes Meta Reading time: approx. 16 min SEO keywords: quicksort, pivot selection, three-way partition, tail recursion, Introsort Meta description: sorting series (5) explaining pivot strategies, duplicate handling, tail recursion, and hybrid optimizations with multilingual implementations. Call to Action (CTA) Benchmark random vs fixed pivots on your real data; compare performance. Add \u0026ldquo;insertion threshold + tail recursion\u0026rdquo; to your implementation and measure stack depth and time. Follow the series: heap sort, non-comparison, TimSort/Introsort, selection guide. ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/5.sorting-series-quick-sort/","summary":"Comprehensive quicksort guide: pivot selection, three-way partitioning, tail recursion optimization, hybrid sorting practices, with multilingual implementations and engineering guidance.","title":"Sorting Series (5): Quick Sort - Pivot Strategy, Tail Recursion, Engineering Practice"},{"content":"LeetCode 1: Two Sum Summary Find two indices such that nums[i] + nums[j] = target. 
Use a hash map for O(n) time.\nApproach Iterate and store value -\u0026gt; index. For each number x, check if target - x exists.\nComplexity Time: O(n) Space: O(n) Python reference implementation def two_sum(nums, target): seen = {} for i, x in enumerate(nums): y = target - x if y in seen: return [seen[y], i] seen[x] = i ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/1-two-sum/","summary":"\u003ch1 id=\"leetcode-1-two-sum\"\u003eLeetCode 1: Two Sum\u003c/h1\u003e\n\u003ch2 id=\"summary\"\u003eSummary\u003c/h2\u003e\n\u003cp\u003eFind two indices such that \u003ccode\u003enums[i] + nums[j] = target\u003c/code\u003e. Use a hash map for O(n) time.\u003c/p\u003e\n\u003ch2 id=\"approach\"\u003eApproach\u003c/h2\u003e\n\u003cp\u003eIterate and store \u003ccode\u003evalue -\u0026gt; index\u003c/code\u003e. For each number \u003ccode\u003ex\u003c/code\u003e, check if \u003ccode\u003etarget - x\u003c/code\u003e exists.\u003c/p\u003e\n\u003ch2 id=\"complexity\"\u003eComplexity\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eTime: O(n)\u003c/li\u003e\n\u003cli\u003eSpace: O(n)\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"python-reference-implementation\"\u003ePython reference implementation\u003c/h2\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#66d9ef\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#a6e22e\"\u003etwo_sum\u003c/span\u003e(nums, target):\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    seen \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e {}\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan 
style=\"color:#66d9ef\"\u003efor\u003c/span\u003e i, x \u003cspan style=\"color:#f92672\"\u003ein\u003c/span\u003e enumerate(nums):\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        y \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e target \u003cspan style=\"color:#f92672\"\u003e-\u003c/span\u003e x\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#66d9ef\"\u003eif\u003c/span\u003e y \u003cspan style=\"color:#f92672\"\u003ein\u003c/span\u003e seen:\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            \u003cspan style=\"color:#66d9ef\"\u003ereturn\u003c/span\u003e [seen[y], i]\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        seen[x] \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e i\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e","title":"LeetCode 1: Two Sum (Hash Map ACERS Summary)"},{"content":"LeetCode 2300: Successful Pairs of Spells and Potions Summary For each spell, count how many potions make spell * potion \u0026gt;= success. Sort potions and binary search the threshold.\nApproach Sort potions. For each spell, compute need = ceil(success / spell). Use binary search to find the first potion \u0026gt;= need. 
Complexity Time: O(n log n) Space: O(1) extra (or O(n) if sorting a copy) Python reference implementation import bisect import math def successful_pairs(spells, potions, success): potions = sorted(potions) n = len(potions) res = [] for s in spells: need = (success + s - 1) // s idx = bisect.bisect_left(potions, need) res.append(n - idx) return res ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/2300-successful-pairs-of-spells-and-potions/","summary":"\u003ch1 id=\"leetcode-2300-successful-pairs-of-spells-and-potions\"\u003eLeetCode 2300: Successful Pairs of Spells and Potions\u003c/h1\u003e\n\u003ch2 id=\"summary\"\u003eSummary\u003c/h2\u003e\n\u003cp\u003eFor each spell, count how many potions make \u003ccode\u003espell * potion \u0026gt;= success\u003c/code\u003e. Sort potions and binary search the threshold.\u003c/p\u003e\n\u003ch2 id=\"approach\"\u003eApproach\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eSort potions.\u003c/li\u003e\n\u003cli\u003eFor each spell, compute \u003ccode\u003eneed = ceil(success / spell)\u003c/code\u003e.\u003c/li\u003e\n\u003cli\u003eUse binary search to find the first potion \u0026gt;= need.\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"complexity\"\u003eComplexity\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eTime: O(n log n)\u003c/li\u003e\n\u003cli\u003eSpace: O(1) extra (or O(n) if sorting a copy)\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"python-reference-implementation\"\u003ePython reference implementation\u003c/h2\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#f92672\"\u003eimport\u003c/span\u003e bisect\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan 
style=\"color:#f92672\"\u003eimport\u003c/span\u003e math\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#66d9ef\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#a6e22e\"\u003esuccessful_pairs\u003c/span\u003e(spells, potions, success):\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    potions \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e sorted(potions)\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    n \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e len(potions)\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    res \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e []\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#66d9ef\"\u003efor\u003c/span\u003e s \u003cspan style=\"color:#f92672\"\u003ein\u003c/span\u003e spells:\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        need \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e (success \u003cspan style=\"color:#f92672\"\u003e+\u003c/span\u003e s \u003cspan style=\"color:#f92672\"\u003e-\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e) \u003cspan style=\"color:#f92672\"\u003e//\u003c/span\u003e s\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        idx \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e bisect\u003cspan style=\"color:#f92672\"\u003e.\u003c/span\u003ebisect_left(potions, need)\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        res\u003cspan style=\"color:#f92672\"\u003e.\u003c/span\u003eappend(n \u003cspan 
style=\"color:#f92672\"\u003e-\u003c/span\u003e idx)\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#66d9ef\"\u003ereturn\u003c/span\u003e res\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e","title":"LeetCode 2300: Successful Pairs of Spells and Potions"},{"content":"LeetCode 2379: Minimum Recolors to Get K Consecutive Black Blocks Summary Given a string of \u0026lsquo;B\u0026rsquo; and \u0026lsquo;W\u0026rsquo;, find the minimum recolors to make a substring of length k all black.\nApproach Use a sliding window of length k and count the number of whites in the window. The minimum whites across all windows is the answer.\nComplexity Time: O(n) Space: O(1) Python reference implementation def minimum_recolors(blocks, k): whites = sum(1 for c in blocks[:k] if c == \u0026#39;W\u0026#39;) ans = whites for i in range(k, len(blocks)): if blocks[i-k] == \u0026#39;W\u0026#39;: whites -= 1 if blocks[i] == \u0026#39;W\u0026#39;: whites += 1 ans = min(ans, whites) return ans ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/2379-minimum-recolors-to-get-k-consecutive-black-blocks/","summary":"\u003ch1 id=\"leetcode-2379-minimum-recolors-to-get-k-consecutive-black-blocks\"\u003eLeetCode 2379: Minimum Recolors to Get K Consecutive Black Blocks\u003c/h1\u003e\n\u003ch2 id=\"summary\"\u003eSummary\u003c/h2\u003e\n\u003cp\u003eGiven a string of \u0026lsquo;B\u0026rsquo; and \u0026lsquo;W\u0026rsquo;, find the minimum recolors to make a substring of length \u003ccode\u003ek\u003c/code\u003e all black.\u003c/p\u003e\n\u003ch2 id=\"approach\"\u003eApproach\u003c/h2\u003e\n\u003cp\u003eUse a sliding window of length \u003ccode\u003ek\u003c/code\u003e and count the number of whites in the window. 
The minimum whites across all windows is the answer.\u003c/p\u003e\n\u003ch2 id=\"complexity\"\u003eComplexity\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eTime: O(n)\u003c/li\u003e\n\u003cli\u003eSpace: O(1)\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"python-reference-implementation\"\u003ePython reference implementation\u003c/h2\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#66d9ef\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#a6e22e\"\u003eminimum_recolors\u003c/span\u003e(blocks, k):\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    whites \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e sum(\u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e \u003cspan style=\"color:#66d9ef\"\u003efor\u003c/span\u003e c \u003cspan style=\"color:#f92672\"\u003ein\u003c/span\u003e blocks[:k] \u003cspan style=\"color:#66d9ef\"\u003eif\u003c/span\u003e c \u003cspan style=\"color:#f92672\"\u003e==\u003c/span\u003e \u003cspan style=\"color:#e6db74\"\u003e\u0026#39;W\u0026#39;\u003c/span\u003e)\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    ans \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e whites\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#66d9ef\"\u003efor\u003c/span\u003e i \u003cspan style=\"color:#f92672\"\u003ein\u003c/span\u003e range(k, len(blocks)):\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#66d9ef\"\u003eif\u003c/span\u003e blocks[i\u003cspan style=\"color:#f92672\"\u003e-\u003c/span\u003ek] 
\u003cspan style=\"color:#f92672\"\u003e==\u003c/span\u003e \u003cspan style=\"color:#e6db74\"\u003e\u0026#39;W\u0026#39;\u003c/span\u003e:\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            whites \u003cspan style=\"color:#f92672\"\u003e-=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#66d9ef\"\u003eif\u003c/span\u003e blocks[i] \u003cspan style=\"color:#f92672\"\u003e==\u003c/span\u003e \u003cspan style=\"color:#e6db74\"\u003e\u0026#39;W\u0026#39;\u003c/span\u003e:\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            whites \u003cspan style=\"color:#f92672\"\u003e+=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        ans \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e min(ans, whites)\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#66d9ef\"\u003ereturn\u003c/span\u003e ans\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e","title":"LeetCode 2379: Minimum Recolors to Get K Consecutive Black Blocks"},{"content":"LeetCode 2841: Maximum Sum of Almost Unique Subarray Summary Given an array, window size k, and threshold m, find the maximum sum of any length-k subarray that contains at least m distinct elements.\nApproach Use a sliding window with a frequency map, track window sum and number of distinct values.\nComplexity Time: O(n) Space: O(n) for frequency map Python reference implementation def max_sum_almost_unique(nums, m, k): from collections import defaultdict count = defaultdict(int) distinct = 0 window_sum = 0 ans = 0 for i, x in enumerate(nums): window_sum += x if 
count[x] == 0: distinct += 1 count[x] += 1 if i \u0026gt;= k: y = nums[i - k] window_sum -= y count[y] -= 1 if count[y] == 0: distinct -= 1 if i \u0026gt;= k - 1 and distinct \u0026gt;= m: ans = max(ans, window_sum) return ans ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/leetcode/2841-maximum-sum-of-almost-unique-subarray/","summary":"\u003ch1 id=\"leetcode-2841-maximum-sum-of-almost-unique-subarray\"\u003eLeetCode 2841: Maximum Sum of Almost Unique Subarray\u003c/h1\u003e\n\u003ch2 id=\"summary\"\u003eSummary\u003c/h2\u003e\n\u003cp\u003eGiven an array, window size \u003ccode\u003ek\u003c/code\u003e, and threshold \u003ccode\u003em\u003c/code\u003e, find the maximum sum of any length-\u003ccode\u003ek\u003c/code\u003e subarray that contains at least \u003ccode\u003em\u003c/code\u003e distinct elements.\u003c/p\u003e\n\u003ch2 id=\"approach\"\u003eApproach\u003c/h2\u003e\n\u003cp\u003eUse a sliding window with a frequency map, track window sum and number of distinct values.\u003c/p\u003e\n\u003ch2 id=\"complexity\"\u003eComplexity\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eTime: O(n)\u003c/li\u003e\n\u003cli\u003eSpace: O(n) for frequency map\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"python-reference-implementation\"\u003ePython reference implementation\u003c/h2\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\u003cspan style=\"color:#66d9ef\"\u003edef\u003c/span\u003e \u003cspan style=\"color:#a6e22e\"\u003emax_sum_almost_unique\u003c/span\u003e(nums, m, k):\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#f92672\"\u003efrom\u003c/span\u003e collections \u003cspan 
style=\"color:#f92672\"\u003eimport\u003c/span\u003e defaultdict\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    count \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e defaultdict(int)\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    distinct \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    window_sum \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    ans \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#66d9ef\"\u003efor\u003c/span\u003e i, x \u003cspan style=\"color:#f92672\"\u003ein\u003c/span\u003e enumerate(nums):\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        window_sum \u003cspan style=\"color:#f92672\"\u003e+=\u003c/span\u003e x\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#66d9ef\"\u003eif\u003c/span\u003e count[x] \u003cspan style=\"color:#f92672\"\u003e==\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e:\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            distinct \u003cspan style=\"color:#f92672\"\u003e+=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan 
style=\"display:flex;\"\u003e\u003cspan\u003e        count[x] \u003cspan style=\"color:#f92672\"\u003e+=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#66d9ef\"\u003eif\u003c/span\u003e i \u003cspan style=\"color:#f92672\"\u003e\u0026gt;=\u003c/span\u003e k:\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            y \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e nums[i \u003cspan style=\"color:#f92672\"\u003e-\u003c/span\u003e k]\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            window_sum \u003cspan style=\"color:#f92672\"\u003e-=\u003c/span\u003e y\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            count[y] \u003cspan style=\"color:#f92672\"\u003e-=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            \u003cspan style=\"color:#66d9ef\"\u003eif\u003c/span\u003e count[y] \u003cspan style=\"color:#f92672\"\u003e==\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e0\u003c/span\u003e:\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e                distinct \u003cspan style=\"color:#f92672\"\u003e-=\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e        \u003cspan style=\"color:#66d9ef\"\u003eif\u003c/span\u003e i \u003cspan style=\"color:#f92672\"\u003e\u0026gt;=\u003c/span\u003e k 
\u003cspan style=\"color:#f92672\"\u003e-\u003c/span\u003e \u003cspan style=\"color:#ae81ff\"\u003e1\u003c/span\u003e \u003cspan style=\"color:#f92672\"\u003eand\u003c/span\u003e distinct \u003cspan style=\"color:#f92672\"\u003e\u0026gt;=\u003c/span\u003e m:\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e            ans \u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e max(ans, window_sum)\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e    \u003cspan style=\"color:#66d9ef\"\u003ereturn\u003c/span\u003e ans\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e","title":"LeetCode 2841: Maximum Sum of Almost Unique Subarray"},{"content":" Sorting series post 4 focuses on merge sort: classic divide and conquer, stable, O(n log n) time, with the cost of O(n) extra space. It is the foundation for external sorting and many stable library sorts.\nTarget Readers Engineers who need stability and can afford O(n) extra space. Learners building divide-and-conquer foundations for quicksort/TimSort. People working with huge files or streams and want external merge knowledge. Background and Motivation Merge sort is O(n log n) on all inputs, unaffected by pivot degeneration. The trade-off is O(n) extra space; in-place variants are complex and costly. External sorting (data larger than RAM) uses \u0026ldquo;chunk sort + k-way merge\u0026rdquo; based on merge ideas. A - Algorithm Problem: Sort a comparable sequence, stable, in O(n log n).\nSteps (top-down)\nSplit: recursively divide the array into halves. Conquer: sort left and right halves. Merge: merge two sorted halves using a buffer. Basic example Array [5,2,4,6,1,3]:\nSplit into [5,2,4] and [6,1,3], then split further. Merge [2,4,5] and [1,3,6] -\u0026gt; [1,2,3,4,5,6] (stable preserves order). 
C - Concepts Key Concept Description Divide and conquer Split into subproblems, solve and merge. Stability When equal, take from left first to preserve relative order. Space Typical implementation uses O(n) buffer; bottom-up still needs buffer. Variants Bottom-up merge, block merge, external k-way merge. Complexity\nTime: T(n) = 2T(n/2) + O(n) =\u0026gt; O(n log n) (best/avg/worst). Space: O(n) buffer (external sort depends on block size). E - Engineering Scenario 1: Stable multi-key sorting (Python) Background: sort logs by date then user_id, need stability. Why: Python\u0026rsquo;s list.sort / sorted use TimSort, which is stable and adaptive, so equal keys keep their original order. Why this example: it shows a real-world multi-key sort where stability matters, and demonstrates that you can rely on the built-in sort instead of hand-writing a merge routine.\nfrom operator import itemgetter logs = [(\u0026#34;2025-11-21\u0026#34;, \u0026#34;u2\u0026#34;), (\u0026#34;2025-11-21\u0026#34;, \u0026#34;u1\u0026#34;), (\u0026#34;2025-11-20\u0026#34;, \u0026#34;u3\u0026#34;)] logs.sort(key=itemgetter(0,1)) print(logs) Scenario 2: External sorting for large files (C++) Background: sort a 10GB integer file with 512MB RAM. Why: chunk sort + k-way merge keeps memory bounded; add a global sequence id to preserve stability for equal keys. Approach: read fixed-size chunks, sort in memory, spill to temp files, then k-way merge with a min-heap. 
Notes: input is one integer per line; output is sorted integers; chunk_items controls memory use.\n#include \u0026lt;algorithm\u0026gt; // std::sort #include \u0026lt;cstdint\u0026gt; // uint64_t fixed-width integer #include \u0026lt;cstdio\u0026gt; // std::remove for deleting temp files #include \u0026lt;fstream\u0026gt; // std::ifstream/std::ofstream #include \u0026lt;iostream\u0026gt; // std::cerr #include \u0026lt;queue\u0026gt; // std::priority_queue #include \u0026lt;string\u0026gt; // std::string, std::to_string #include \u0026lt;vector\u0026gt; // std::vector // A simple aggregate type. In C++, \u0026#34;struct\u0026#34; defaults to public members. // \u0026#34;uint64_t\u0026#34; is a fixed-width unsigned integer used for a stable tie-breaker. struct Record { int value; uint64_t seq; }; // Each heap entry remembers which temp file it came from. // \u0026#34;size_t\u0026#34; is an unsigned type for sizes/indices. struct HeapItem { Record rec; size_t file_index; }; // Comparator functor for std::priority_queue (a max-heap by default). // By reversing the comparison, we effectively get a min-heap by (value, seq). struct HeapCmp { bool operator()(const HeapItem\u0026amp; a, const HeapItem\u0026amp; b) const { if (a.rec.value != b.rec.value) return a.rec.value \u0026gt; b.rec.value; return a.rec.seq \u0026gt; b.rec.seq; } }; // \u0026#34;static\u0026#34; gives internal linkage (file-local). The \u0026#34;\u0026gt;\u0026gt;\u0026#34; operator reads tokens. // It returns false when the stream fails (EOF or invalid format). static bool read_record(std::ifstream\u0026amp; in, Record\u0026amp; out) { return static_cast\u0026lt;bool\u0026gt;(in \u0026gt;\u0026gt; out.value \u0026gt;\u0026gt; out.seq); } int main(int argc, char** argv) { // argc is the count of arguments; argv is the array of C-strings. 
if (argc \u0026lt; 3) { std::cerr \u0026lt;\u0026lt; \u0026#34;Usage: \u0026#34; \u0026lt;\u0026lt; argv[0] \u0026lt;\u0026lt; \u0026#34; \u0026lt;input\u0026gt; \u0026lt;output\u0026gt; [chunk_items]\\n\u0026#34;; std::cerr \u0026lt;\u0026lt; \u0026#34;Input format: one integer per line.\\n\u0026#34;; return 1; } std::string input_path = argv[1]; std::string output_path = argv[2]; size_t chunk_items = 1000000; // Default chunk size (items, not bytes). if (argc \u0026gt;= 4) { // std::stoull converts a string to unsigned long long. chunk_items = static_cast\u0026lt;size_t\u0026gt;(std::stoull(argv[3])); if (chunk_items == 0) { std::cerr \u0026lt;\u0026lt; \u0026#34;chunk_items must be \u0026gt; 0\\n\u0026#34;; return 1; } } std::ifstream in(input_path); if (!in) { std::cerr \u0026lt;\u0026lt; \u0026#34;Failed to open input: \u0026#34; \u0026lt;\u0026lt; input_path \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 1; } std::vector\u0026lt;std::string\u0026gt; temp_files; std::vector\u0026lt;Record\u0026gt; buffer; buffer.reserve(chunk_items); uint64_t seq = 0; // Monotonic sequence to preserve stability across chunks. while (true) { buffer.clear(); int v; // Read up to chunk_items values; \u0026#34;\u0026amp;\u0026amp;\u0026#34; short-circuits on EOF. while (buffer.size() \u0026lt; chunk_items \u0026amp;\u0026amp; (in \u0026gt;\u0026gt; v)) { buffer.push_back(Record{v, seq++}); // Aggregate initialization. } if (buffer.empty()) break; // std::sort uses the comparator lambda: \u0026#34;[]\u0026#34; means no captures. 
std::sort(buffer.begin(), buffer.end(), [](const Record\u0026amp; a, const Record\u0026amp; b) { if (a.value != b.value) return a.value \u0026lt; b.value; return a.seq \u0026lt; b.seq; }); std::string tmp_name = \u0026#34;chunk_\u0026#34; + std::to_string(temp_files.size()) + \u0026#34;.tmp\u0026#34;; std::ofstream out(tmp_name); if (!out) { std::cerr \u0026lt;\u0026lt; \u0026#34;Failed to create temp file: \u0026#34; \u0026lt;\u0026lt; tmp_name \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 1; } for (const auto\u0026amp; rec : buffer) { out \u0026lt;\u0026lt; rec.value \u0026lt;\u0026lt; \u0026#39; \u0026#39; \u0026lt;\u0026lt; rec.seq \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; } temp_files.push_back(tmp_name); if (!in) break; } // Handle empty input: just create an empty output file. if (temp_files.empty()) { std::ofstream out(output_path); return 0; } std::vector\u0026lt;std::ifstream\u0026gt; inputs; inputs.reserve(temp_files.size()); for (const auto\u0026amp; name : temp_files) { // emplace_back constructs the ifstream in-place. inputs.emplace_back(name); if (!inputs.back()) { std::cerr \u0026lt;\u0026lt; \u0026#34;Failed to open temp file: \u0026#34; \u0026lt;\u0026lt; name \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 1; } } // priority_queue\u0026lt;value, container, comparator\u0026gt; std::priority_queue\u0026lt;HeapItem, std::vector\u0026lt;HeapItem\u0026gt;, HeapCmp\u0026gt; heap; for (size_t i = 0; i \u0026lt; inputs.size(); ++i) { Record rec; if (read_record(inputs[i], rec)) { heap.push(HeapItem{rec, i}); } } std::ofstream out(output_path); if (!out) { std::cerr \u0026lt;\u0026lt; \u0026#34;Failed to open output: \u0026#34; \u0026lt;\u0026lt; output_path \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; return 1; } // k-way merge: always take the smallest (value, seq) from the heap. 
while (!heap.empty()) { HeapItem top = heap.top(); heap.pop(); out \u0026lt;\u0026lt; top.rec.value \u0026lt;\u0026lt; \u0026#39;\\n\u0026#39;; Record next; if (read_record(inputs[top.file_index], next)) { heap.push(HeapItem{next, top.file_index}); } } // Clean up temp files using std::remove from \u0026lt;cstdio\u0026gt;. for (const auto\u0026amp; name : temp_files) { std::remove(name.c_str()); } return 0; } Example (simulate external sorting with tiny memory): Use a small chunk_items so the program must spill to temp files and perform a k-way merge, just like the 10GB/512MB scenario.\nInput (numbers.txt):\n5 1 5 2 2 3 Compile:\ng++ -O2 -std=c++17 external_sort.cpp -o external_sort Run (set chunk_items=3 to force multiple chunks):\n./external_sort numbers.txt sorted.txt 3 Output (sorted.txt):\n1 2 2 3 5 Scenario 3: Frontend stable sorting (JavaScript) Background: table must keep original order for equal keys. Why: Array.prototype.sort has been required to be stable since ES2019; a hand-written merge makes the stability guarantee explicit and portable.\nfunction mergeSort(arr){ if(arr.length\u0026lt;=1) return arr; const mid = arr.length\u0026gt;\u0026gt;1; const left = mergeSort(arr.slice(0,mid)); const right = mergeSort(arr.slice(mid)); const res=[]; let i=0,j=0; while(i\u0026lt;left.length \u0026amp;\u0026amp; j\u0026lt;right.length){ if(left[i].key \u0026lt;= right[j].key) res.push(left[i++]); else res.push(right[j++]); } return res.concat(left.slice(i)).concat(right.slice(j)); } console.log(mergeSort([{key:1},{key:1},{key:0}])); Scenario 4: Stable backend sorting (Go) Background: stable sort by multiple fields. 
Why: sort.SliceStable is merge-based and stable.\npackage main import ( \u0026#34;fmt\u0026#34; \u0026#34;sort\u0026#34; ) type Item struct{ Date string; User string } func main(){ items := []Item{{\u0026#34;2025-11-21\u0026#34;,\u0026#34;u2\u0026#34;},{\u0026#34;2025-11-21\u0026#34;,\u0026#34;u1\u0026#34;},{\u0026#34;2025-11-20\u0026#34;,\u0026#34;u3\u0026#34;}} sort.SliceStable(items, func(i, j int) bool { if items[i].Date == items[j].Date { return items[i].User \u0026lt; items[j].User } return items[i].Date \u0026lt; items[j].Date }) fmt.Println(items) } R - Reflection Complexity analysis: time O(n log n), space O(n); external sort cost dominated by I/O and block size. Alternatives: vs quicksort: quick is in-place and small-constant but unstable and can degrade; merge is stable and bounded. vs heap sort: heap is in-place and unstable with poor cache behavior; merge is better for stability/external. vs TimSort: TimSort is faster on nearly sorted data and stable but more complex; merge is its base. Why it is preferred: stable, predictable O(n log n), and the default for external sorting. S - Summary Merge sort provides stable and predictable O(n log n) with O(n) extra space. External sorting, stable multi-key sorting, and many stable libraries rely on merge ideas. Bottom-up merge avoids recursion depth but still needs buffers. If input is nearly sorted and you want more speed, consider TimSort; if memory is tight and stability is not needed, use quick/heap. Evaluate stability needs, memory budget, data size, and I/O cost. Practice Guide / Steps Decide stability and memory budget: if O(n) buffer is ok, use merge/stable libraries; otherwise quick/heap. Choose implementation: top-down is simplest; bottom-up avoids deep recursion. Ensure stability in merge: when equal, pick from left. Edge tests: empty, single, all equal, reverse, heavy duplicates. 
Runnable Examples: Multilingual Implementations Python (top-down) def merge_sort(a): if len(a) \u0026lt;= 1: return a mid = len(a)//2 left = merge_sort(a[:mid]) right = merge_sort(a[mid:]) i=j=0; res=[] while i \u0026lt; len(left) and j \u0026lt; len(right): if left[i] \u0026lt;= right[j]: res.append(left[i]); i+=1 else: res.append(right[j]); j+=1 res.extend(left[i:]); res.extend(right[j:]) return res print(merge_sort([5,2,4,6,1,3])) C (bottom-up) #include \u0026lt;stdlib.h\u0026gt; void merge(int *a, int *buf, int l, int m, int r){ int i=l, j=m, k=l; while(i\u0026lt;m \u0026amp;\u0026amp; j\u0026lt;r){ if(a[i] \u0026lt;= a[j]) buf[k++] = a[i++]; else buf[k++] = a[j++]; } while(i\u0026lt;m) buf[k++] = a[i++]; while(j\u0026lt;r) buf[k++] = a[j++]; for(int t=l; t\u0026lt;r; ++t) a[t]=buf[t]; } void merge_sort(int *a, int n){ int *buf = malloc(sizeof(int)*n); for(int width=1; width\u0026lt;n; width*=2){ for(int i=0; i\u0026lt;n; i+=2*width){ int l=i, m=i+width\u0026lt; n? i+width: n, r=i+2*width\u0026lt; n? 
i+2*width: n; merge(a, buf, l, m, r); } } free(buf); } C++ (top-down) void merge(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int m, int r, vector\u0026lt;int\u0026gt;\u0026amp; buf){ int i=l,j=m,k=l; while(i\u0026lt;m \u0026amp;\u0026amp; j\u0026lt;r){ if(a[i]\u0026lt;=a[j]) buf[k++]=a[i++]; else buf[k++]=a[j++]; } while(i\u0026lt;m) buf[k++]=a[i++]; while(j\u0026lt;r) buf[k++]=a[j++]; for(int t=l;t\u0026lt;r;++t) a[t]=buf[t]; } void merge_sort(vector\u0026lt;int\u0026gt;\u0026amp; a, int l, int r, vector\u0026lt;int\u0026gt;\u0026amp; buf){ if(r-l\u0026lt;=1) return; int m = l + (r-l)/2; merge_sort(a,l,m,buf); merge_sort(a,m,r,buf); merge(a,l,m,r,buf); } Go (top-down) func mergeSort(a []int) []int { if len(a) \u0026lt;= 1 { return a } mid := len(a)/2 left := mergeSort(a[:mid]) right := mergeSort(a[mid:]) res := make([]int, 0, len(a)) i, j := 0, 0 for i \u0026lt; len(left) \u0026amp;\u0026amp; j \u0026lt; len(right) { if left[i] \u0026lt;= right[j] { res = append(res, left[i]); i++ } else { res = append(res, right[j]); j++ } } res = append(res, left[i:]...) res = append(res, right[j:]...) 
return res } Rust (top-down with buffer) fn merge_sort(a: \u0026amp;mut [i32]) { let n = a.len(); if n \u0026lt;= 1 { return; } let mid = n/2; merge_sort(\u0026amp;mut a[..mid]); merge_sort(\u0026amp;mut a[mid..]); let mut buf = a.to_vec(); merge(\u0026amp;a[..mid], \u0026amp;a[mid..], \u0026amp;mut buf[..]); a.copy_from_slice(\u0026amp;buf); } fn merge(left: \u0026amp;[i32], right: \u0026amp;[i32], out: \u0026amp;mut [i32]) { let (mut i, mut j, mut k) = (0,0,0); while i \u0026lt; left.len() \u0026amp;\u0026amp; j \u0026lt; right.len() { if left[i] \u0026lt;= right[j] { out[k]=left[i]; i+=1; } else { out[k]=right[j]; j+=1; } k+=1; } if i \u0026lt; left.len() { out[k..k+left.len()-i].copy_from_slice(\u0026amp;left[i..]); } if j \u0026lt; right.len() { out[k..k+right.len()-j].copy_from_slice(\u0026amp;right[j..]); } } JavaScript (top-down) function mergeSort(a){ if(a.length\u0026lt;=1) return a; const mid = a.length\u0026gt;\u0026gt;1; const left = mergeSort(a.slice(0,mid)); const right = mergeSort(a.slice(mid)); const res=[]; let i=0,j=0; while(i\u0026lt;left.length \u0026amp;\u0026amp; j\u0026lt;right.length){ if(left[i] \u0026lt;= right[j]) res.push(left[i++]); else res.push(right[j++]); } return res.concat(left.slice(i)).concat(right.slice(j)); } console.log(mergeSort([5,2,4,6,1,3])); Common Pitfalls and Notes Recursion depth: for large n use bottom-up iteration or stack tuning. Space usage: evaluate O(n) buffer; external sort requires careful block sizing and merge fan-in. Stability: when equal, take left to preserve order. Performance: use double-buffering or alternating buffers to reduce copies. Best Practices Use stable library sort when available (Python/Java object sort, Go SliceStable). Factor out a merge helper to enforce stability; use bottom-up for large arrays. External sort: control chunk size, use priority queue for k-way merge, batch I/O. For nearly sorted data, consider TimSort; merge is its foundation. 
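The "alternating buffers" note in Best Practices can be made concrete. Below is one possible bottom-up sketch (the function name is mine): instead of copying the buffer back into the array after every merge pass, the source and destination arrays swap roles, so each pass performs a single write per element.

```python
def merge_sort_pingpong(a):
    # Bottom-up merge sort with two alternating ("ping-pong") buffers:
    # src and dst swap roles each pass instead of copying back.
    n = len(a)
    src, dst = list(a), [0] * n
    width = 1
    while width < n:
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)
            hi = min(lo + 2 * width, n)
            i, j, k = lo, mid, lo
            while i < mid and j < hi:
                if src[i] <= src[j]:          # take left on ties -> stable
                    dst[k] = src[i]
                    i += 1
                else:
                    dst[k] = src[j]
                    j += 1
                k += 1
            dst[k:hi] = src[i:mid] if i < mid else src[j:hi]
        src, dst = dst, src                   # swap roles; no copy-back pass
        width *= 2
    return src

assert merge_sort_pingpong([5, 2, 4, 6, 1, 3]) == [1, 2, 3, 4, 5, 6]
```

Compared with the C bottom-up version above, this trades the per-pass copy loop for bookkeeping about which array currently holds the result.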
Conclusion Merge sort is stable and predictably O(n log n), ideal for stable multi-key sorting and external sorting. Extra space is the main trade-off; in-place variants are complex and uncommon. Bottom-up merge avoids recursion; external merge is essential for huge data. References and Further Reading CLRS \u0026ldquo;Introduction to Algorithms\u0026rdquo; merge sort TimSort paper and CPython/Java sources (run merging) PostgreSQL tuplesort external sorting implementation Meta Reading time: approx. 15 min SEO keywords: merge sort, stable sort, external sorting, divide and conquer Meta description: sorting series (4) explaining merge sort stability, space trade-offs, external sorting, and multilingual implementations. Call to Action (CTA) Benchmark built-in stable sort vs your merge implementation on your dataset. If sorting large files, prototype chunk + k-way merge and measure I/O cost. Follow the series: quicksort, heap sort, non-comparison, TimSort/Introsort, selection guide. ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/4.sorting-series-merge-sort/","summary":"Systematic explanation of merge sort principles, stability, space trade-offs, and engineering scenarios with Python/C/C++/Go/Rust/JS implementations and external sorting guidance.","title":"Sorting Series (4): Merge Sort - Stable Divide and Conquer and External Sorting"},{"content":" Sorting series post 3 focuses on Shell sort: grouped insertion with decreasing gaps reduces worst-case O(n^2) toward around O(n log^2 n), making it a key step in understanding \u0026ldquo;local order to global order.\u0026rdquo;\nTarget Readers Learners who know insertion sort and want its higher-level optimization. Engineers needing in-place sorting for mid-size data. People explaining the impact of gap sequences in talks or courses. Background and Motivation Insertion sort is fast when nearly sorted but remains O(n^2) on random arrays. 
Shell sort uses grouping + shrinking gaps so elements move toward their approximate positions early. Gap sequence choice directly affects performance and complexity. A - Algorithm Problem: Sort a length-n comparable sequence in-place.\nCore steps (gap starts at n/2)\nChoose initial gap; split array into gap-based subsequences. Run insertion sort on each subsequence (step size = gap). Decrease gap and repeat until gap = 1 (equivalent to insertion). Basic example Array [9, 8, 3, 7, 5, 6, 4, 1], gap sequence 4 -\u0026gt; 2 -\u0026gt; 1:\ngap=4: subsequences (0,4),(1,5),(2,6),(3,7), insert sort each to roughly position elements. gap=2: finer groups and more insertion. gap=1: final insertion pass for full order. C - Concepts Key Concept Description Gap sequence Common choices: n/2, Knuth (1,4,13,40,\u0026hellip;), Sedgewick, etc. Determines comparison bounds. Grouped insertion Insertion sort on gap-separated subsequences, moves distant elements early. In-place Uses constant extra space. Stability Classic Shell sort is not stable (cross-gap swaps can reorder equals). Complexity range\nWorst-case depends on gap sequence; simple n/2 can still be O(n^2). Good sequences (e.g., Sedgewick) achieve O(n^(4/3)) or O(n log^2 n), often ~O(n^{1.2-1.3}) in practice. Space: O(1). E - Engineering Scenario 1: Mid-size, memory-sensitive sort (C) Background: embedded/backend arrays (1e4~1e5), need in-place with low memory. Why: Shell sort is in-place with low constants; often better than pure insertion; more stable performance than quick/heap on some distributions.\nvoid shell_sort(int *a, int n) { // Knuth sequence: 1,4,13,40,... 
until \u0026lt; n/3 int gap = 1; while (gap \u0026lt; n/3) gap = gap * 3 + 1; for (; gap \u0026gt;= 1; gap /= 3) { for (int i = gap; i \u0026lt; n; ++i) { int temp = a[i], j = i; while (j \u0026gt;= gap \u0026amp;\u0026amp; a[j-gap] \u0026gt; temp) { a[j] = a[j-gap]; j -= gap; } a[j] = temp; } } } Scenario 2: Nearly sorted business lists (Python) Background: list appends a few elements; total size \u0026lt;= 1e5. Why: gentle gap sequence moves distant elements into place, then gap=1 finishes with insertion.\ndef shell_sort(arr): n = len(arr) gap = 1 while gap \u0026lt; n // 3: gap = 3 * gap + 1 # Knuth while gap \u0026gt;= 1: for i in range(gap, n): temp = arr[i] j = i while j \u0026gt;= gap and arr[j - gap] \u0026gt; temp: arr[j] = arr[j - gap] j -= gap arr[j] = temp gap //= 3 return arr data = [9,8,3,7,5,6,4,1] print(shell_sort(data)) Scenario 3: Go backend batch sorting Background: per-request sorting length 1e3~1e4; in-place to reduce GC. Why: custom Shell sort as an alternative to reduce allocations.\npackage main import \u0026#34;fmt\u0026#34; func shellSort(a []int) { gap := 1 for gap \u0026lt; len(a)/3 { gap = gap*3 + 1 } for gap \u0026gt;= 1 { for i := gap; i \u0026lt; len(a); i++ { tmp, j := a[i], i for j \u0026gt;= gap \u0026amp;\u0026amp; a[j-gap] \u0026gt; tmp { a[j] = a[j-gap] j -= gap } a[j] = tmp } gap /= 3 } } func main(){ arr := []int{9,8,3,7,5,6,4,1}; shellSort(arr); fmt.Println(arr) } Scenario 4: Frontend large array but low memory (JavaScript) Background: browser handling thousands of records, avoid allocations. 
Why: in-place and short implementation; Knuth sequence works well.\nfunction shellSort(a){ let gap = 1; while (gap \u0026lt; a.length/3) gap = gap*3 + 1; while (gap \u0026gt;= 1){ for (let i = gap; i \u0026lt; a.length; i++){ const tmp = a[i]; let j = i; while (j \u0026gt;= gap \u0026amp;\u0026amp; a[j-gap] \u0026gt; tmp){ a[j] = a[j-gap]; j -= gap; } a[j] = tmp; } gap = Math.floor(gap/3); } return a; } console.log(shellSort([9,8,3,7,5,6,4,1])); R - Reflection Complexity: Time depends on gap sequence. Knuth works well in practice; its worst case is O(n^(3/2)). Sedgewick improves to O(n^(4/3)) bounds. Space: O(1). Alternatives: vs insertion: Shell reduces long-distance moves; gap=1 returns to insertion. vs quick/heap: Shell is more cache-friendly but lacks strict O(n log n) bounds. vs merge: merge is stable but needs O(n) extra space; Shell is in-place but unstable. Why it works: Large gaps quickly move elements toward their approximate positions, reducing later insertion cost. Gap selection is critical; too large yields little benefit, too small leaves too many inversions. S - Summary Shell sort = grouped insertion + decreasing gaps. In-place but unstable; performance hinges on gap sequence. Knuth sequence is a practical default; Sedgewick/Pratt can improve theoretical bounds. Best for mid-size arrays when in-place is required and stability is not. In hybrid strategies, Shell can replace insertion for small segments as a middle layer. Benchmark using real data distribution; theory alone is not enough. Practice Guide / Steps Choose gap: Knuth by default; use Sedgewick if you want better upper bounds. Switching rule: after gap=1, finish with insertion; in hybrids, use Shell below a threshold. Test sets: random, nearly sorted, reversed, heavy duplicates. Record metrics: comparisons/moves, time, cache hits (perf/pprof). 
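The "record metrics" step can be automated with a small instrumented sort. This is a sketch with helper names I chose (`shell_sort_metrics`, `knuth_gaps`, `halving_gaps`); it counts comparisons and moves so two gap sequences can be compared on identical input:

```python
def shell_sort_metrics(a, gaps):
    # Instrumented Shell sort: returns (sorted copy, comparisons, moves)
    # so different gap sequences can be compared on the same data.
    a = list(a)
    comps = moves = 0
    for gap in gaps:
        for i in range(gap, len(a)):
            tmp, j = a[i], i
            while j >= gap:
                comps += 1
                if a[j - gap] > tmp:
                    a[j] = a[j - gap]
                    moves += 1
                    j -= gap
                else:
                    break
            a[j] = tmp
    return a, comps, moves

def knuth_gaps(n):
    # Knuth sequence 1, 4, 13, 40, ... capped below n, largest first.
    gaps = [1]
    while gaps[-1] * 3 + 1 < n:
        gaps.append(gaps[-1] * 3 + 1)
    return gaps[::-1]

def halving_gaps(n):
    # Naive n/2, n/4, ..., 1 sequence for comparison.
    gaps = []
    g = n // 2
    while g >= 1:
        gaps.append(g)
        g //= 2
    return gaps

data = [9, 8, 3, 7, 5, 6, 4, 1] * 32
for name, gaps in [('knuth', knuth_gaps(len(data))), ('n/2', halving_gaps(len(data)))]:
    out, comps, moves = shell_sort_metrics(data, gaps)
    assert out == sorted(data)
    print(name, 'comparisons:', comps, 'moves:', moves)
```

The same harness works for the test sets listed above (random, nearly sorted, reversed, heavy duplicates); just swap `data`.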
Runnable Examples: Multilingual Implementations Python def shell_sort(a): n=len(a); gap=1 while gap \u0026lt; n//3: gap = 3*gap + 1 while gap\u0026gt;=1: for i in range(gap,n): tmp=a[i]; j=i while j\u0026gt;=gap and a[j-gap]\u0026gt;tmp: a[j]=a[j-gap]; j-=gap a[j]=tmp gap//=3 return a print(shell_sort([9,8,3,7,5,6,4,1])) C void shell_sort(int *a, int n){ int gap=1; while(gap \u0026lt; n/3) gap = gap*3 + 1; for(; gap\u0026gt;=1; gap/=3){ for(int i=gap;i\u0026lt;n;i++){ int tmp=a[i], j=i; while(j\u0026gt;=gap \u0026amp;\u0026amp; a[j-gap]\u0026gt;tmp){ a[j]=a[j-gap]; j-=gap; } a[j]=tmp; } } } C++ void shell(vector\u0026lt;int\u0026gt;\u0026amp; a){ int n=a.size(), gap=1; while(gap\u0026lt;n/3) gap=gap*3+1; for(; gap\u0026gt;=1; gap/=3){ for(int i=gap;i\u0026lt;n;i++){ int tmp=a[i], j=i; while(j\u0026gt;=gap \u0026amp;\u0026amp; a[j-gap]\u0026gt;tmp){ a[j]=a[j-gap]; j-=gap; } a[j]=tmp; } } } Go func ShellSort(a []int) { gap := 1 for gap \u0026lt; len(a)/3 { gap = gap*3 + 1 } for gap \u0026gt;= 1 { for i := gap; i \u0026lt; len(a); i++ { tmp, j := a[i], i for j \u0026gt;= gap \u0026amp;\u0026amp; a[j-gap] \u0026gt; tmp { a[j] = a[j-gap] j -= gap } a[j] = tmp } gap /= 3 } } Rust pub fn shell_sort(a: \u0026amp;mut [i32]) { let mut gap = 1usize; while gap \u0026lt; a.len()/3 { gap = gap*3 + 1; } while gap \u0026gt;= 1 { for i in gap..a.len() { let tmp = a[i]; let mut j = i; while j \u0026gt;= gap \u0026amp;\u0026amp; a[j-gap] \u0026gt; tmp { a[j] = a[j-gap]; j -= gap; } a[j] = tmp; } if gap == 1 { break; } gap /= 3; } } JavaScript function shellSort(a){ let gap=1; while(gap \u0026lt; a.length/3) gap = gap*3 + 1; while(gap\u0026gt;=1){ for(let i=gap;i\u0026lt;a.length;i++){ const tmp=a[i]; let j=i; while(j\u0026gt;=gap \u0026amp;\u0026amp; a[j-gap]\u0026gt;tmp){ a[j]=a[j-gap]; j-=gap; } a[j]=tmp; } gap=Math.floor(gap/3); } return a; } Common Pitfalls and Notes Stability: Shell sort is unstable; if stability is required, choose merge/TimSort. 
Gap choice: simple n/2 often degenerates; use at least Knuth or Sedgewick. Size: for tiny arrays use insertion; for huge arrays evaluate O(n log n) alternatives. Benchmark: gap sequences behave differently across data distributions; measure on your data. Best Practices Default to Knuth sequence for simplicity and performance; use Sedgewick/Pratt for better bounds. In hybrids, replace the \u0026ldquo;small segment threshold\u0026rdquo; with Shell and measure whether it beats insertion. For teaching, visualize gap=4/2/1 passes to illustrate grouped insertion. Count comparisons/moves to evaluate different gap sequences. Conclusion Shell sort uses grouped insertion to reduce long-distance inversions; in-place but unstable. Knuth is practical; for stability or strict bounds use merge/TimSort/heap. As a hybrid layer, Shell can bridge insertion and quick/heap performance. References and Further Reading D. L. Shell, \u0026ldquo;A High-Speed Sorting Procedure\u0026rdquo; (1959) Robert Sedgewick, \u0026ldquo;Analysis of Shellsort and Related Algorithms\u0026rdquo; (1986) CLRS discussion of Shell sort Meta Reading time: approx. 15 min SEO keywords: Shell sort, gap sequence, in-place sorting, unstable sorting Meta description: sorting series (3) explaining gap sequences, complexity, and engineering usage of Shell sort with multilingual implementations. Call to Action (CTA) Compare Knuth vs Sedgewick sequences on your dataset and record timing differences. Replace small-segment insertion in your quicksort with Shell sort and measure the impact. Follow the series: merge, quick, heap, non-comparison, TimSort/Introsort, selection guide. 
","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/3.sorting-series-shell-sort/","summary":"Explain Shell sort principles, gap strategies, and engineering usage with scenarios and Python/C/C++/Go/Rust/JS implementations.","title":"Sorting Series (3): Shell Sort - From Insertion to Gap-Based Efficiency"},{"content":" This is the second post in the sorting series, focusing on three O(n^2) baseline algorithms: bubble, selection, and insertion. They are simple and foundational, useful for understanding higher-level sorts and still valuable on small or nearly sorted data.\nTarget Readers Practice and teaching: need to master baseline sorts and explain them. Engineers: small-scale, embedded, or code-size-sensitive scenarios. Learners: want to understand stability, in-place behavior, and complexity sources. Background and Motivation Pain points: O(n^2) sorts are often ignored, but they represent the three core ideas: swap, select, insert. On small or nearly sorted data, insertion sort can outperform O(n log n) algorithms. We need a side-by-side comparison of stability, swaps/moves, and engineering use. A - Algorithm Theme: Compare bubble (swap-driven), selection (min selection), and insertion (prefix insertion) with a basic example.\nExample array: [5, 2, 4, 6, 1]\nBubble: adjacent swaps bubble the max to the end; repeat n passes. Selection: pick min each pass and swap into place; at most n swaps. Insertion: maintain a sorted prefix and insert current element; efficient when nearly sorted. Illustration (first two insertion passes):\nPass 1: |5| 2 4 6 1 -\u0026gt; 2 5 4 6 1 Pass 2: 2 |5| 4 6 1 -\u0026gt; 2 4 5 6 1 C - Concepts Algorithm Idea Stable In-place Comparisons (avg) Swaps/Moves Bubble Adjacent swaps Yes Yes O(n^2) O(n^2) swaps Selection Select min each pass No Yes O(n^2) O(n) swaps Insertion Insert into sorted prefix Yes Yes O(n^2) O(n^2) moves; O(n) when nearly sorted Where they fit\nBubble: teaching, stability requirement, tiny arrays. 
Selection: high swap cost (large objects), comparisons are acceptable. Insertion: small arrays, nearly sorted; also used as a subroutine in TimSort/Shell. E - Engineering Scenario 1: Embedded firmware small arrays (C) Background: microcontroller sorting at most tens of integers. Why: short code, in-place, no extra memory; selection sort has few swaps.\n// Selection sort, in-place O(1) space void selection_sort(int *a, int n) { for (int i = 0; i \u0026lt; n - 1; ++i) { int min_i = i; for (int j = i + 1; j \u0026lt; n; ++j) if (a[j] \u0026lt; a[min_i]) min_i = j; if (min_i != i) { int tmp = a[i]; a[i] = a[min_i]; a[min_i] = tmp; } } } Scenario 2: Nearly sorted small lists (Python) Background: UI list receives a few new items; data is mostly ordered. Why: insertion sort runs close to O(n) when inversions are small.\ndef insertion_sort(arr): for i in range(1, len(arr)): key = arr[i] j = i - 1 while j \u0026gt;= 0 and arr[j] \u0026gt; key: arr[j + 1] = arr[j] j -= 1 arr[j + 1] = key return arr data = [1, 2, 3, 5, 4] print(insertion_sort(data)) Scenario 3: Teaching visualization (JavaScript) Background: show swap vs insert in a classroom or demo. Why: bubble sort is stable and intuitive; JS is short and visual.\nfunction bubbleSort(arr) { const a = [...arr]; for (let i = 0; i \u0026lt; a.length; i++) { let swapped = false; for (let j = 0; j \u0026lt; a.length - i - 1; j++) { if (a[j] \u0026gt; a[j + 1]) { [a[j], a[j + 1]] = [a[j + 1], a[j]]; swapped = true; } } if (!swapped) break; // small optimization } return a; } console.log(bubbleSort([5, 2, 4, 6, 1])); Scenario 4: Small batch sorting (Go) Background: request payload size \u0026lt; 64, prefer smaller constants. 
Why: Go sort switches to insertion for small segments; show a minimal implementation.\npackage main import \u0026#34;fmt\u0026#34; func insertionSort(a []int) { for i := 1; i \u0026lt; len(a); i++ { key := a[i] j := i - 1 for j \u0026gt;= 0 \u0026amp;\u0026amp; a[j] \u0026gt; key { a[j+1] = a[j] j-- } a[j+1] = key } } func main() { arr := []int{5, 2, 4, 6, 1} insertionSort(arr) fmt.Println(arr) } R - Reflection Complexity: all three have O(n^2) time in worst/avg, O(1) space. Stability: bubble and insertion are stable; selection is not (swap can break order). Alternatives: Small arrays: insertion beats bubble/selection; also used as fallback in TimSort/Introsort. Large arrays: switch to O(n log n) (quick/merge/heap) or non-comparison. Why keep them: Teaching value: clear view of comparisons, swaps, and moves. Engineering value: tiny input, nearly sorted, code size constraints, or as hybrid submodules. S - Summary Bubble/selection/insertion represent swap/select/insert ideas and are core to understanding sorting. Stability: bubble and insertion are stable; selection is not but uses few swaps. On small or nearly sorted data, insertion often beats O(n log n) algorithms. Modern sort implementations use hybrids: large segments use quick/heap/merge, small segments fall back to insertion. Choose based on size, degree of order, stability, and swap cost. Practice Guide / Steps If n \u0026lt; 64 and nearly sorted, prefer insertion. Need stability and visualization: bubble with early-exit optimization. Swap cost is high: selection reduces swaps. Use insertion as a subroutine in quick/merge for small segments. 
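The last step above, insertion as a small-segment subroutine, can be sketched as a hybrid quicksort. This is an illustrative sketch, not any library's actual scheme; the names and the cutoff value 16 are assumptions to tune by measurement:

```python
import random

CUTOFF = 16  # assumed threshold for falling back to insertion sort

def insertion_range(a, lo, hi):
    # In-place insertion sort on a[lo..hi] inclusive.
    for i in range(lo + 1, hi + 1):
        key, j = a[i], i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def hybrid_quicksort(a, lo=0, hi=None):
    # Quicksort that hands small segments to insertion sort, recursing on
    # the smaller partition to bound stack depth.
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        if hi - lo + 1 <= CUTOFF:
            insertion_range(a, lo, hi)
            return
        p = a[random.randint(lo, hi)]  # random pivot avoids sorted-input worst case
        i, j = lo, hi
        while i <= j:                  # Hoare-style partition around value p
            while a[i] < p:
                i += 1
            while a[j] > p:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if j - lo < hi - i:
            hybrid_quicksort(a, lo, j)
            lo = i
        else:
            hybrid_quicksort(a, i, hi)
            hi = j

data = [random.randint(0, 999) for _ in range(500)]
expected = sorted(data)
hybrid_quicksort(data)
assert data == expected
print('hybrid quicksort ok')
```

This mirrors the structure of TimSort/Introsort hybrids: an O(n log n) driver for large segments, insertion sort below a threshold.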
Runnable Examples (Multilang Baselines) Python - Insertion def insertion_sort(a): for i in range(1, len(a)): key = a[i]; j = i - 1 while j \u0026gt;= 0 and a[j] \u0026gt; key: a[j+1] = a[j]; j -= 1 a[j+1] = key return a print(insertion_sort([5,2,4,6,1])) C - Selection void selection_sort(int *a, int n) { for (int i = 0; i \u0026lt; n - 1; ++i) { int min_i = i; for (int j = i + 1; j \u0026lt; n; ++j) if (a[j] \u0026lt; a[min_i]) min_i = j; if (min_i != i) { int t=a[i]; a[i]=a[min_i]; a[min_i]=t; } } } C++ - Bubble #include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; void bubble(vector\u0026lt;int\u0026gt;\u0026amp; a){ for(size_t i=0;i\u0026lt;a.size();++i){ bool swapped=false; for(size_t j=0;j+1\u0026lt;a.size()-i;++j){ if(a[j]\u0026gt;a[j+1]){swap(a[j],a[j+1]);swapped=true;} } if(!swapped) break; } } int main(){vector\u0026lt;int\u0026gt; a={5,2,4,6,1}; bubble(a); for(int x:a) cout\u0026lt;\u0026lt;x\u0026lt;\u0026lt;\u0026#34; \u0026#34;;} Go - Insertion func insertion(a []int){ for i:=1;i\u0026lt;len(a);i++{ key:=a[i]; j:=i-1 for j\u0026gt;=0 \u0026amp;\u0026amp; a[j]\u0026gt;key { a[j+1]=a[j]; j-- } a[j+1]=key } } Rust - Insertion fn insertion_sort(a: \u0026amp;mut [i32]) { for i in 1..a.len() { let key = a[i]; let mut j = i as i32 - 1; while j \u0026gt;= 0 \u0026amp;\u0026amp; a[j as usize] \u0026gt; key { a[(j+1) as usize] = a[j as usize]; j -= 1; } a[(j+1) as usize] = key; } } fn main(){ let mut v = vec![5,2,4,6,1]; insertion_sort(\u0026amp;mut v); println!(\u0026#34;{:?}\u0026#34;, v); } JavaScript - Bubble function bubbleSort(a){ for(let i=0;i\u0026lt;a.length;i++){ let swapped=false; for(let j=0;j\u0026lt;a.length-i-1;j++){ if(a[j]\u0026gt;a[j+1]){[a[j],a[j+1]]=[a[j+1],a[j]];swapped=true;} } if(!swapped) break; } return a; } console.log(bubbleSort([5,2,4,6,1])); Explanation and Trade-offs Bubble vs Selection: bubble is stable but swaps more; selection swaps less but is unstable. 
Insertion vs Bubble: insertion does fewer moves; nearly sorted arrays can drop to O(n). Hybrid strategy: quick/heap for large segments, insertion for small segments. Common Pitfalls and Notes Bubble without early exit wastes O(n^2) on already sorted input. Selection with large structs: fewer swaps but each swap is costly; for stability use index arrays. Insertion degrades on large arrays; best for small or nearly sorted segments. Best Practices Keep a comparison table for stability, swaps/moves, and constants. Write an insertion helper and reuse it in custom quick/merge sorts. Test with sorted, reversed, many duplicates, and nearly sorted to observe early-exit behavior. Conclusion The O(n^2) trio is the foundation and a key component of hybrid sorting. For nearly sorted or small inputs, insertion remains a strong choice. Stable needs -\u0026gt; bubble or insertion; swap cost sensitive -\u0026gt; selection or stable variants. References and Further Reading CLRS chapters on insertion/bubble/selection sort CPython TimSort implementation notes (insertion thresholds) Intel/AMD notes on cache effects for small array sorting Meta Reading time: approx. 14 min SEO keywords: bubble sort, selection sort, insertion sort, O(n^2) sorting, stability Meta description: sorting series (2) comparing bubble/selection/insertion principles, stability, scenarios, and multilingual implementations for small or nearly sorted data. Call to Action (CTA) Benchmark a small dataset (e.g., 50 log rows) with all three algorithms. Add a \u0026ldquo;\u0026lt;= 32 switch to insertion\u0026rdquo; optimization in your quick/merge and measure the gain. Follow the series: shell, merge, quick, heap, non-comparison, TimSort/Introsort, selection guide. 
","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/2.sorting-series-on2-baseline/","summary":"Systematic ACERS explanation of bubble/selection/insertion sorts: principles, stability, scenarios, and multilingual implementations with selection guidance.","title":"Sorting Series (2): Bubble, Selection, Insertion - Three O(n^2) Baselines"},{"content":" For readers preparing a systematic sorting series: this is the preface. We first build a selection map with the ACERS framework so you can quickly decide when to use quicksort, merge, heap, counting/radix, TimSort, Introsort, and more.\nTarget Readers Practice and study: want a structured overview to write a sorting series. Backend/data engineers: care about memory, stability, and parallel scenarios. Teaching and sharing: need a reusable framework and examples. Background and Motivation Pain points: too many sorting algorithms with similar names; stability/complexity are easy to mix up; engineering requires cache behavior, external sort, and language defaults. Goal: provide a \u0026ldquo;selection cheat sheet + scenarios + code skeletons\u0026rdquo; so the rest of the series has consistent structure and language. A - Algorithm Theme: How to choose a sorting algorithm based on input size, data distribution, and stability requirements.\nBasic examples\nExample 1: small array (\u0026lt;= 30) and nearly sorted -\u0026gt; insertion sort. Example 2: mid-size random array (~1e4) -\u0026gt; quicksort or Introsort. Example 3: huge integer keys with narrow range (\u0026lt;= 1e6) -\u0026gt; counting or bucket sort. 
Simple input/output\nInput: an array/slice of comparable elements Output: array/slice sorted in non-decreasing order C - Concepts Algorithm Avg Time Space Stable In-place Notes Bubble/Selection/Insertion O(n^2) O(1) Bubble/Insertion yes Yes/Yes/Yes baseline/teaching Shell between O(n^2) and O(n log n) O(1) No Yes gap sequence matters Merge O(n log n) O(n) Yes No good for external sort Quick O(n log n) avg; worst O(n^2) O(log n) No Yes pivot choice is key Heap O(n log n) O(1) No Yes good for streaming top-k Counting/Bucket/Radix O(n + k) O(n + k) Counting/Radix stable No/depends known range/digits required TimSort O(n log n) O(n) Yes No default in Python/Java Introsort O(n log n) O(1) No Yes C++ std::sort Categories\nDivide and conquer: merge, quick. Heap-based: heap sort. Increment-based: shell. Non-comparison: counting, bucket, radix. Engineering hybrids: TimSort (insertion + merge), Introsort (quick + heap + insertion). E - Engineering Scenario 1: Batch analytics (Python) Background: process 1e6 log rows, stable sort by timestamp to keep original order for ties. Why: Python built-in sort uses TimSort, stable and fast on partially sorted data.\nfrom operator import itemgetter logs = [ (\u0026#34;2025-11-01T10:00:00\u0026#34;, \u0026#34;user1\u0026#34;, 3), (\u0026#34;2025-11-01T10:00:00\u0026#34;, \u0026#34;user2\u0026#34;, 1), (\u0026#34;2025-11-01T10:00:01\u0026#34;, \u0026#34;user3\u0026#34;, 2), ] # Sort by timestamp ascending; stable keeps original order for ties logs.sort(key=itemgetter(0)) print(logs) Scenario 2: Backend pagination sort (Go) Background: sort items by price ascending and sales descending; dataset \u0026lt; 1e5. 
Why: sort.Slice is in-place and flexible; standard library is sufficient here.\npackage main import ( \u0026#34;fmt\u0026#34; \u0026#34;sort\u0026#34; ) type Item struct { Price int; Sales int } func main() { items := []Item{{100, 50}, {80, 200}, {100, 120}} sort.Slice(items, func(i, j int) bool { if items[i].Price == items[j].Price { return items[i].Sales \u0026gt; items[j].Sales // sales desc } return items[i].Price \u0026lt; items[j].Price }) fmt.Println(items) } Scenario 3: Memory-limited offline sort (C++, external merge) Background: sort a 10GB integer file with only 512MB RAM. Why: external sort requires chunking + multi-way merge; stable and memory-bound.\n#include \u0026lt;bits/stdc++.h\u0026gt; using namespace std; int main() { vector\u0026lt;int\u0026gt; buf; buf.reserve(1 \u0026lt;\u0026lt; 20); // ~1M ints vector\u0026lt;string\u0026gt; tmpFiles; int x; int chunk = 0; while (cin \u0026gt;\u0026gt; x) { buf.push_back(x); if (buf.size() == buf.capacity()) { sort(buf.begin(), buf.end()); string name = \u0026#34;chunk\u0026#34; + to_string(chunk++) + \u0026#34;.tmp\u0026#34;; ofstream out(name); for (int v : buf) out \u0026lt;\u0026lt; v \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; tmpFiles.push_back(name); buf.clear(); } } // Omit the final chunk and k-way merge; show the idea cerr \u0026lt;\u0026lt; \u0026#34;chunks: \u0026#34; \u0026lt;\u0026lt; tmpFiles.size() \u0026lt;\u0026lt; \u0026#34;\\n\u0026#34;; } Scenario 4: Frontend table sorting (JavaScript) Background: sort by multiple columns and keep order for equal keys (stable). 
Why: modern browsers usually have stable Array.prototype.sort; if unsure, map indices and sort.\nconst rows = [ { price: 100, sales: 50 }, { price: 100, sales: 120 }, { price: 80, sales: 200 }, ]; rows .map((row, idx) =\u0026gt; ({ ...row, idx })) .sort((a, b) =\u0026gt; a.price - b.price || a.idx - b.idx) .forEach(r =\u0026gt; console.log(r)); R - Reflection Complexity and space: O(n log n) workhorses: merge (stable, not in-place), quick (in-place, worst-case), heap (in-place, cache-unfriendly). O(n + k) non-comparison: counting/bucket/radix, only when range/digits are bounded. O(n^2) baseline: bubble/selection/insertion, mostly for teaching or tiny arrays. Alternatives: External sort vs in-memory: if data exceeds RAM, chunk + merge is mandatory. TimSort vs plain merge: TimSort is faster on partially sorted data and is the engineering default. Introsort vs plain quicksort: depth fallback to heap avoids worst-case O(n^2). Why this selection is reasonable: Stability first: merge/TimSort/counting/radix. Memory first: quick/heap/Introsort (in-place). Range known: counting/bucket/radix. Huge data: external merge with multi-way streaming. S - Summary Selection depends on data size, distribution, stability needs, and memory/external constraints. Default to the language built-in sort (often TimSort/Introsort); customize only for special needs. Non-comparison sorting can drop complexity to O(n + k) when range/digits are bounded. External sorting is essential for data larger than memory; core idea is chunking + multi-way merge. Choose criteria first (time, space, stability), then the algorithm. Practice Guide / Steps Step 1: Evaluate size and distribution (random/near-sorted/heavy duplicates). Step 2: Decide stability requirement and memory budget. Step 3: Use the table to pick a baseline; for Python/Java, prefer built-in stable sort. Step 4: Write three boundary tests: all equal, reverse, nearly sorted. Step 5: Benchmark large data and record time/memory. 
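Steps 1, 4, and 5 can be combined into one small harness: generate the boundary distributions, then time the built-in sort on each. The helper name `distributions` and the ~1% swap rate for "nearly sorted" are my choices:

```python
import random
import time

def distributions(n):
    # Boundary distributions from the checklist: random, nearly sorted,
    # reversed, all equal.
    base = list(range(n))
    nearly = list(base)
    for _ in range(max(1, n // 100)):     # ~1% random swaps
        i, j = random.randrange(n), random.randrange(n)
        nearly[i], nearly[j] = nearly[j], nearly[i]
    return {
        'random': random.sample(base, n),
        'nearly_sorted': nearly,
        'reversed': base[::-1],
        'all_equal': [7] * n,
    }

for name, arr in distributions(200_000).items():
    t0 = time.perf_counter()
    out = sorted(arr)                     # TimSort; expect large wins on ordered inputs
    dt = time.perf_counter() - t0
    assert all(out[k] <= out[k + 1] for k in range(len(out) - 1))
    print(f'{name}: {dt:.3f}s')
```

On CPython the nearly_sorted, reversed, and all_equal cases should run markedly faster than random, which is exactly the TimSort behavior the table above describes.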
Runnable Example (Quick Benchmark, Python) import random, time def bench(n=100000): arr = [random.randint(0, 1000000) for _ in range(n)] t0 = time.time(); sorted(arr); t1 = time.time() print(f\u0026#34;n={n}, timsort time={t1 - t0:.3f}s\u0026#34;) if __name__ == \u0026#34;__main__\u0026#34;: bench(200000) Common Pitfalls and Notes Forgetting comparator logic in Array.sort / sort.Slice may break stability or ordering. Fixed pivot in quicksort degenerates on sorted input; use random or median-of-three. Counting/bucket sort can blow memory if range is large; estimate min/max first. External sort with too many temp files requires k-way merge or staged merging. Best Practices Use standard library sorts in production unless you have explicit constraints. Write comparator + tests before implementing the sort; verify stability if needed. Sample large data to judge whether bucket/radix is appropriate or external sort is needed. Require a \u0026ldquo;sorting algorithm + rationale\u0026rdquo; note in PRs for review. Conclusion This preface provides the ACERS selection map for the series. Next steps: expand by category (O(n^2) baseline, shell, merge, quick, heap, non-comparison, TimSort, Introsort, selection playbook). References and Further Reading CLRS \u0026ldquo;Introduction to Algorithms\u0026rdquo; sorting chapters TimSort paper and CPython listobject.c C++ std::sort / std::stable_sort implementation notes PostgreSQL external sort (tuplesort) Meta Reading time: approx. 12 min SEO keywords: sorting selection, algorithm stability, TimSort, Introsort, external sort Meta description: sorting series preface using ACERS to summarize complexity, stability, and engineering scenarios with runnable examples and a selection checklist. Call to Action (CTA) Write a \u0026ldquo;sorting selection checklist\u0026rdquo; for your project with size/distribution/stability notes. Run the Python benchmark above with your real data distribution. 
Follow the series (quicksort, merge, heap, non-comparison, TimSort/Introsort, selection playbook) and replicate it with the ACERS template. ","permalink":"https://shio-chan-dev.github.io/jeanblog/alg/sorting/1.sorting-series-preface/","summary":"Use the ACERS template to map common sorting algorithms by scenario, complexity, stability, and engineering usage, with runnable examples and a selection checklist.","title":"Sorting Series (1): How to Choose an Algorithm - Time, Space, Stability, Scenarios"},{"content":"UFW + CrowdSec: Stop Malicious Port Scans Subtitle / Abstract: How do you protect exposed server ports? This guide shows how to move past Fail2ban regex hell and build a stable, automated, intelligent port-scan defense system.\nTarget readers Developers using FRP or reverse tunnels Operators of cloud servers (Tencent, Alibaba, AWS, etc.) Linux users who want to stop port scans and SSH brute force People using Fail2ban who want a modern alternative Anyone improving personal server security Background / Motivation: Why you need port-scan defense When you run FRP (frps + frpc) or expose multiple ports, you will often see:\nMassive scans: repeated SYN probes Malicious connection attempts SSH password brute force Automated scans of 6001-6010, 7000, 22, 8080, etc. 
Traditional approaches have weaknesses:\nUFW only blocks passively Fail2ban is regex-heavy, error-prone, and lacks behavior analysis FRPS logs are hard to match in Fail2ban Attacks still consume frps/sshd resources and can cause slowdowns We need a modern system: no regex, auto detection, intelligent IP banning.\nCore concepts FRP (frps / frpc): reverse tunnel tool, often exposes many TCP ports (e.g., 6001-6010) UFW: Ubuntu firewall, but not intelligent Fail2ban: log-matching ban tool that requires regex CrowdSec (recommended): modern open-source IPS that detects port scans and brute force with behavior analysis and low resource usage Practical guide: Auto-block port scans with CrowdSec (Ubuntu/Debian) 1) Install CrowdSec curl -s https://packagecloud.io/install/repositories/crowdsec/crowdsec/script.deb.sh | sudo bash sudo apt install crowdsec -y 2) Install firewall bouncer (iptables/UFW) sudo apt install crowdsec-firewall-bouncer-iptables CrowdSec will manage blocking automatically.\n3) What it detects out of the box TCP port scans FRP brute-force attempts SSH brute force High-rate connections (DoS-like) Suspicious sequences (behavior analysis) No extra rules needed for 6001-6010 and other ports.\n4) View banned IPs sudo cscli decisions list Example:\nID Scope Value Reason Duration 1 Ip 195.24.237.176 portscan 4h 2 Ip 213.199.63.251 ssh-bf 24h 5) Manual ban (optional) sudo cscli decisions add --ip 195.24.237.176 6) Dashboard (optional) sudo cscli dashboard setup (deploys a Metabase dashboard; requires Docker) Why CrowdSec \u0026gt; Fail2ban Feature Fail2ban CrowdSec Port-scan detection No Yes (auto) FRP log support Regex heavy No log match needed Config complexity High Low Performance Medium Very low Extensibility Weak Modular + behavior analysis Visualization None Dashboard Resource usage Medium RAM \u0026lt; 20MB CrowdSec is a modern replacement for Fail2ban with lower overhead and stronger detection.\nFail2ban pitfalls (why it fails in FRP scenarios) FRPS logs are complex; IP fields shift and are 
inconsistent Regex must be perfect; a small mistake matches nothing Logs include colons, brackets, and ports, which break patterns Host IP may be internal (e.g., 10.5.100.2), causing mismatched source IPs UFW log formats vary; Fail2ban cannot extract IP reliably Encoding issues can lead to \u0026ldquo;No failure-id group\u0026rdquo; Risks and notes Blocking can briefly impact FRP or SSH; always keep a backup access method (cloud console). CrowdSec may occasionally flag legitimate crawlers. To unban a trusted IP: sudo cscli decisions delete --ip \u0026lt;trusted-ip\u0026gt; and, to keep it unbanned, add it to a whitelist parser (e.g., /etc/crowdsec/parsers/s02-enrich/mywhitelists.yaml) FRP often hides real client IPs in its own logs; CrowdSec can also parse kernel/firewall logs (UFW/iptables), which still record the real source IP. Best practices Replace Fail2ban with CrowdSec (strongly recommended) Close unused FRP ports, use strong tokens and encryption Use SSH keys only, disable password auth Keep UFW default deny incoming Check bans regularly: cscli decisions list Consider Cloudflare Tunnel as an alternative to FRP Summary This guide covered:\nHow to detect and block port scans Why Fail2ban regex often fails Why FRP logs are a poor fit for Fail2ban How CrowdSec provides automated, low-maintenance protection Final solution: UFW + CrowdSec = stable, automated, low-maintenance server defense.\nReferences CrowdSec docs: https://doc.crowdsec.net CrowdSec bouncer: https://github.com/crowdsecurity/cs-firewall-bouncer Fail2ban docs: https://fail2ban.readthedocs.io FRP: https://github.com/fatedier/frp UFW docs: https://wiki.ubuntu.com/UFW ","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/ufw-crowdsec-portscan/","summary":"\u003ch1 id=\"ufw--crowdsec-stop-malicious-port-scans\"\u003eUFW + CrowdSec: Stop Malicious Port Scans\u003c/h1\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract:\u003c/strong\u003e How do you protect exposed server ports? 
This guide shows how to move past Fail2ban regex hell and build a stable, automated, intelligent port-scan defense system.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eDevelopers using FRP or reverse tunnels\u003c/li\u003e\n\u003cli\u003eOperators of cloud servers (Tencent, Alibaba, AWS, etc.)\u003c/li\u003e\n\u003cli\u003eLinux users who want to stop port scans and SSH brute force\u003c/li\u003e\n\u003cli\u003ePeople using Fail2ban who want a modern alternative\u003c/li\u003e\n\u003cli\u003eAnyone improving personal server security\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"background--motivation-why-you-need-port-scan-defense\"\u003eBackground / Motivation: Why you need port-scan defense\u003c/h2\u003e\n\u003cp\u003eWhen you run FRP (frps + frpc) or expose multiple ports, you will often see:\u003c/p\u003e","title":"UFW + CrowdSec: Stop Malicious Port Scans (From Fail2ban Pain to a Modern Solution)"},{"content":"WireGuard Full Guide: Build a Secure High-Speed Private Network (VPN Tutorial) Subtitle / Abstract: A beginner-to-intermediate WireGuard VPN guide. Learn to build a fast, secure private network and enforce a zero-exposure model where services are only reachable through VPN.\nTarget readers People who want to hide server or PC ports behind a VPN Users who want to reduce scanning and brute force risk Anyone building a private LAN or remote access to home Linux/Windows users, developers, and ops beginners Background and motivation: Why WireGuard? 
If you expose ports to the public internet (SSH, databases, admin panels), you will face:\nconstant scans brute force attempts automated probes potential intrusion risk OpenVPN is mature but heavy, slower, and complex to configure.\nWireGuard is built for modern security:\nsmall, secure, fast (next-gen VPN) codebase \u0026lt; 4000 lines (OpenVPN is 400k+) easy config low latency, high throughput great for private networks and remote work This guide helps you build a private network that is invisible to the public internet.\nCore concepts What is WireGuard? WireGuard is a modern, minimal VPN protocol in the Linux kernel, using modern crypto (ChaCha20, Curve25519, etc.).\nHighlights:\nvery fast simple config files strong security by default stable roaming (mobile network switching works) Terminology Term Meaning Interface WireGuard virtual interface, e.g., wg0 Peer A node (client/server) PrivateKey private key (keep secret) PublicKey public key (identity to peers) AllowedIPs IP ranges allowed for a peer WireGuard is peer-to-peer and does not need a certificate system like OpenVPN.\nWireGuard vs OpenVPN Item WireGuard OpenVPN Performance very fast (kernel) slower (user space) Config complexity minimal complex Security modern by default configurable but easy to misconfigure Stability high average Roaming excellent weak Code size ~4000 lines ~400k lines One-line summary: if you want speed, simplicity, and stability, choose WireGuard.\nPractical setup: Build WireGuard on a server Example uses Ubuntu/Debian.\n1. Install WireGuard sudo apt update sudo apt install wireguard -y 2. Generate server keys wg genkey | tee server_private.key | wg pubkey \u0026gt; server_public.key 3. Create server config /etc/wireguard/wg0.conf [Interface] Address = 10.8.0.1/24 ListenPort = 51820 PrivateKey = \u0026lt;server_private_key\u0026gt; # client peers will be added below 4. 
Start WireGuard sudo wg-quick up wg0 Enable on boot:\nsudo systemctl enable wg-quick@wg0 Create a mobile client (Peer) 1. Generate client keys wg genkey | tee phone_private.key | wg pubkey \u0026gt; phone_public.key 2. Add peer on the server Edit /etc/wireguard/wg0.conf:\n[Peer] PublicKey = \u0026lt;phone_public_key\u0026gt; AllowedIPs = 10.8.0.2/32 Restart WireGuard:\nsudo wg-quick down wg0 sudo wg-quick up wg0 3. Create client config (phone) phone.conf:\n[Interface] PrivateKey = \u0026lt;phone_private_key\u0026gt; Address = 10.8.0.2/32 DNS = 1.1.1.1 [Peer] PublicKey = \u0026lt;server_public_key\u0026gt; Endpoint = \u0026lt;your-public-ip-or-domain\u0026gt;:51820 AllowedIPs = 0.0.0.0/0 PersistentKeepalive = 25 Import on mobile via QR code Install:\nAndroid: WireGuard (Google Play) iOS: WireGuard (App Store) Generate QR code:\nqrencode -t ansiutf8 \u0026lt; phone.conf Scan in the WireGuard app.\nAfter connecting, your phone gets:\nInternal IP: 10.8.0.2 You can access:\nYour server: 10.8.0.1 Examples:\nSSH: ssh user@10.8.0.1 RDP: 10.8.0.1 Web: http://10.8.0.1:xxxx Why this works 1. Peer-to-peer design No certificates, no TLS, no expiry issues.\n2. Keys are identity Each device has a key pair as its identity.\n3. Kernel implementation WireGuard runs in the kernel crypto subsystem for high efficiency.\n4. 
Designed for modern networks Seamless roaming between 4G and Wi-Fi on mobile.\nCommon pitfalls Error 1: port 51820/udp not open You must open:\nUDP 51820 Error 2: wrong AllowedIPs If you set:\nAllowedIPs = 0.0.0.0/0 all phone traffic goes through VPN.\nTo access only the LAN:\nAllowedIPs = 10.8.0.0/24 Error 3: IP forwarding disabled echo \u0026#34;net.ipv4.ip_forward=1\u0026#34; \u0026gt;\u0026gt; /etc/sysctl.conf sysctl -p Best practices Generate a unique key pair for each device Do not share config files Use a fixed server IP (or DDNS) Restrict non-VPN traffic via UFW Bind backend services to internal IP only Example: SSH listens only on\nListenAddress 10.8.0.1 Summary WireGuard is ideal for:\nhome or work private networks hiding server ports secure remote access building a private LAN This guide covers principles, install, config, mobile access, and best practices. You can now:\ndeploy WireGuard quickly on any server access your private network securely avoid public port exposure and scanning If you need:\nDocker-based WireGuard Windows as the server multi-user management advanced routing Let me know and I can extend the series.\nReferences WireGuard docs: https://www.wireguard.com/ Linux man pages: man wg, man wg-quick WireGuard paper: https://www.wireguard.com/papers/wireguard.pdf Meta (SEO) Keywords: WireGuard tutorial, VPN private network, self-hosted VPN, server security, WireGuard vs OpenVPN Reading time: 8-12 minutes Tags: VPN, Linux, security, private network, tutorial Meta description: A comprehensive WireGuard VPN tutorial for building a fast and secure private network, with setup and mobile access. Call to Action (CTA) If this helped, star it, ask questions, or tell me your WireGuard scenario. 
I can help you tailor the config and extend the series.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/wireguard-vpn-neiwang/","summary":"\u003ch1 id=\"wireguard-full-guide-build-a-secure-high-speed-private-network-vpn-tutorial\"\u003eWireGuard Full Guide: Build a Secure High-Speed Private Network (VPN Tutorial)\u003c/h1\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract:\u003c/strong\u003e\nA beginner-to-intermediate WireGuard VPN guide. Learn to build a fast, secure private network and enforce a zero-exposure model where services are only reachable through VPN.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003ePeople who want to hide server or PC ports behind a VPN\u003c/li\u003e\n\u003cli\u003eUsers who want to reduce scanning and brute force risk\u003c/li\u003e\n\u003cli\u003eAnyone building a private LAN or remote access to home\u003c/li\u003e\n\u003cli\u003eLinux/Windows users, developers, and ops beginners\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"background-and-motivation-why-wireguard\"\u003eBackground and motivation: Why WireGuard?\u003c/h2\u003e\n\u003cp\u003eIf you expose ports to the public internet (SSH, databases, admin panels), you will face:\u003c/p\u003e","title":"WireGuard Full Guide: Build a Secure High-Speed Private Network (VPN Tutorial)"},{"content":"Build a Hugo Blog with GitHub Pages in 10 Minutes Subtitle / Abstract This guide takes you from zero to a deployed Hugo blog on GitHub Pages with GitHub Actions. 
It is beginner-friendly and explains the key moving parts.\nTarget readers Hugo beginners Developers who want a quick technical blog Users of GitHub Pages and GitHub Actions Anyone who wants free static hosting Background / Motivation Common pain points when publishing a blog:\nmanual uploads scattered deployment steps confusing GitHub Pages setup theme assets failing to build The combo Hugo + GitHub Pages + GitHub Actions solves these:\nHugo is fast Pages is free Actions deploys on every push Core concepts Hugo: static site generator GitHub Pages: free static hosting GitHub Actions: CI pipeline to build and deploy PaperMod: popular Hugo theme Steps: from local to online Step 1: Create a Hugo site hugo new site myblog cd myblog git init Add PaperMod:\ngit submodule add https://github.com/adityatelange/hugo-PaperMod.git themes/PaperMod Set config.toml:\nbaseURL = \u0026#34;https://\u0026lt;your-username\u0026gt;.github.io/\u0026lt;repo\u0026gt;/\u0026#34; languageCode = \u0026#34;en-us\u0026#34; title = \u0026#34;My Blog\u0026#34; theme = \u0026#34;PaperMod\u0026#34; Step 2: Push to GitHub git remote add origin git@github.com:\u0026lt;your-username\u0026gt;/\u0026lt;repo\u0026gt;.git git add . git commit -m \u0026#34;init blog\u0026#34; git push -u origin main Step 3: Enable GitHub Pages In your repo:\nSettings -\u0026gt; Pages Build and deployment -\u0026gt; Source = GitHub Actions If the repo is private, set it public or enable Pages for private repositories (paid). 
Otherwise you may see 404.\nStep 4: Add GitHub Actions workflow Create .github/workflows/hugo.yml:\nname: Deploy Hugo site to Pages on: push: branches: [\u0026#34;main\u0026#34;] jobs: build: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 with: submodules: true - uses: peaceiris/actions-hugo@v2 with: hugo-version: \u0026#34;0.120.4\u0026#34; - name: Build run: hugo --minify - name: Upload uses: actions/upload-pages-artifact@v2 with: path: ./public deploy: needs: build runs-on: ubuntu-latest permissions: pages: write id-token: write environment: name: github-pages url: ${{ steps.deployment.outputs.page_url }} steps: - name: Deploy to GitHub Pages id: deployment uses: actions/deploy-pages@v2 Step 5: Create a post and publish hugo new posts/hello-world.md Edit front matter, set draft: false, write your content, and push.\nCommon issues draft: true means the post will not show incorrect baseURL causes broken links missing submodules in Actions -\u0026gt; theme missing Pages source not set to Actions Summary Hugo builds fast static pages GitHub Actions automates build and deploy GitHub Pages hosts for free Once set, you only need to write Markdown and push.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/how-to-build-a-blog-system/","summary":"\u003ch1 id=\"build-a-hugo-blog-with-github-pages-in-10-minutes\"\u003eBuild a Hugo Blog with GitHub Pages in 10 Minutes\u003c/h1\u003e\n\u003ch2 id=\"subtitle--abstract\"\u003eSubtitle / Abstract\u003c/h2\u003e\n\u003cp\u003eThis guide takes you from zero to a deployed Hugo blog on GitHub Pages with GitHub Actions. 
It is beginner-friendly and explains the key moving parts.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHugo beginners\u003c/li\u003e\n\u003cli\u003eDevelopers who want a quick technical blog\u003c/li\u003e\n\u003cli\u003eUsers of GitHub Pages and GitHub Actions\u003c/li\u003e\n\u003cli\u003eAnyone who wants free static hosting\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eCommon pain points when publishing a blog:\u003c/p\u003e","title":"How to Build a Blog System"},{"content":"How to Publish with Hugo: From Markdown to Online Blog Subtitle / Abstract This guide explains how to create, manage, and publish Hugo posts: front matter, drafts, images, directory structure, local preview, and deployment.\nTarget readers Hugo beginners Developers building a technical blog with Hugo Writers using Markdown + static sites Users of PaperMod, DoIt, and similar themes Background / Motivation After setting up a Hugo site, common questions include:\nWhere should posts go? How should front matter be written? Where do images live? Why does the post show locally but not online? How do drafts and publish dates work? How do posts appear on the homepage? This guide provides practical steps and best practices for the full publishing flow.\nCore concepts 1) Content directory Hugo posts live under content/:\ncontent/ posts/ my-first-post.md 2) Front matter Top metadata controls title, date, draft, tags, etc.\n--- title: \u0026#34;My Title\u0026#34; date: 2024-08-26 draft: false tags: [\u0026#34;hugo\u0026#34;, \u0026#34;blog\u0026#34;] --- 3) Draft Drafts are not built. 
Use hugo server -D to preview drafts.\n4) Section content/posts/* maps to the /posts/ section.\nSteps Step 1: Create a new post hugo new posts/how-to-publish.md This creates:\ncontent/posts/how-to-publish.md Default content:\n--- title: \u0026#34;How to Publish\u0026#34; date: 2024-08-26T10:00:00+08:00 draft: true --- Step 2: Edit front matter A typical PaperMod-friendly front matter:\n--- title: \u0026#34;How to Publish with Hugo\u0026#34; date: 2024-08-26T10:00:00+08:00 draft: false tags: [\u0026#34;hugo\u0026#34;, \u0026#34;blog\u0026#34;, \u0026#34;static-site\u0026#34;] categories: [\u0026#34;tutorial\u0026#34;] summary: \u0026#34;A complete guide from writing to publishing.\u0026#34; cover: image: \u0026#34;/images/hugo-cover.png\u0026#34; alt: \u0026#34;Hugo cover\u0026#34; caption: \u0026#34;Hugo blog cover\u0026#34; --- Step 3: Write content Use Markdown and keep headings consistent. Add code blocks where needed.\nStep 4: Add images Common options:\nstatic/images/... and reference as /images/... Page bundles: content/posts/my-post/index.md with images in the same folder Step 5: Preview locally hugo server -D Open http://localhost:1313.\nStep 6: Build and deploy hugo Static output goes to public/. 
Deploy via GitHub Pages, Netlify, or your server.\nCommon pitfalls draft: true prevents publishing Wrong folder (e.g., outside content/) Missing or incorrect baseURL Images referenced with wrong paths Theme config not loaded Summary Create posts with hugo new Write front matter carefully Preview with hugo server -D Build with hugo and deploy If you want a complete deployment pipeline with GitHub Actions, see the next guide.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/how-to-publish-by-hugo/","summary":"\u003ch1 id=\"how-to-publish-with-hugo-from-markdown-to-online-blog\"\u003eHow to Publish with Hugo: From Markdown to Online Blog\u003c/h1\u003e\n\u003ch2 id=\"subtitle--abstract\"\u003eSubtitle / Abstract\u003c/h2\u003e\n\u003cp\u003eThis guide explains how to create, manage, and publish Hugo posts: front matter, drafts, images, directory structure, local preview, and deployment.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eHugo beginners\u003c/li\u003e\n\u003cli\u003eDevelopers building a technical blog with Hugo\u003c/li\u003e\n\u003cli\u003eWriters using Markdown + static sites\u003c/li\u003e\n\u003cli\u003eUsers of PaperMod, DoIt, and similar themes\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eAfter setting up a Hugo site, common questions include:\u003c/p\u003e","title":"How to Publish with Hugo"},{"content":"Title (accurate and keyword-rich) Write Clear Requirements with Issue Templates: A Complete Guide to GitHub Issue Forms\nSubtitle / Abstract This post teaches you how to configure GitHub Issue templates for feature requests and bug reports, including folder structure, YAML forms, Markdown templates, and common pitfalls. 
It is ideal for teams that want clearer requirements and less back-and-forth.\nTarget readers This article is for:\nBackend/frontend/full-stack engineers who create Issues regularly Leads/TLs/architects who want standardized requirement intake Mid-level developers familiar with GitHub but new to Issue templates Beginners can follow, but basic GitHub knowledge is assumed.\nBackground / Motivation: Why templates? Without templates, you often hear:\n\u0026ldquo;What is the background?\u0026rdquo; \u0026ldquo;Which modules are affected?\u0026rdquo; \u0026ldquo;What is the acceptance criteria?\u0026rdquo; \u0026ldquo;How high is the priority?\u0026rdquo; A one-line Issue like:\n\u0026ldquo;Add export feature\u0026rdquo;\nwill confuse everyone.\nLong-term pain points:\nHigh communication cost: details must be asked repeatedly Information asymmetry: the requester knows, but the Issue does not Hard to plan: no priority or acceptance criteria Hard to trace: months later, nobody knows the intent GitHub Issue Templates are structured questions that guide good input:\nenforce or guide required fields auto-label and auto-prefix titles support form UI validation Goal: make every new Issue understandable at a glance.\nCore concepts 1. Issue Template A preset format shown when creating a new Issue Can be Markdown text or YAML form 2. Markdown template Old and simple Essentially a prefilled Markdown file Path: .github/ISSUE_TEMPLATE/xxx.md or .github/ISSUE_TEMPLATE.md 3. YAML Issue form New and recommended Form UI with inputs and dropdowns Submissions are converted to Markdown in the Issue body Path: .github/ISSUE_TEMPLATE/xxx.yml 4. 
config.yml Path: .github/ISSUE_TEMPLATE/config.yml Controls: Whether blank Issues are allowed Which templates are displayed Practice guide / Steps overview Create .github/ISSUE_TEMPLATE Create a Feature template (YAML form) (Optional) Create a Bug template Configure config.yml for blank Issue behavior Commit and push to GitHub Verify in the GitHub UI Step 1: Create the template folder mkdir -p .github/ISSUE_TEMPLATE Structure:\nyour-repo/ .github/ ISSUE_TEMPLATE/ # yml / md files go here src/ ... Step 2: Feature request template (YAML form) Create .github/ISSUE_TEMPLATE/feature-request.yml:\nname: \u0026#34;Feature Request\u0026#34; description: \u0026#34;Use for new features or requirement changes\u0026#34; title: \u0026#34;[Feature] \u0026#34; labels: - \u0026#34;feature\u0026#34; - \u0026#34;enhancement\u0026#34; body: - type: markdown attributes: value: | Thanks for submitting a feature request. Please be as clear as possible for evaluation and planning. - type: input id: module attributes: label: Affected module description: Service or module involved (API, crawler, UI, etc.) placeholder: e.g. export API / attachment viewer validations: required: true - type: textarea id: background attributes: label: Background / scenario description: Why do we need this? What problem are we solving? placeholder: | Describe business context, roles, usage scenario, pain points... validations: required: true - type: textarea id: description attributes: label: Requirement description description: Describe desired behavior from the user perspective. placeholder: | 1. Add ... on page ... 2. When user does ..., the system should ... 3. Edge cases to support: ... validations: required: true - type: textarea id: acceptance_criteria attributes: label: Acceptance criteria description: What counts as \u0026#34;done\u0026#34;? Helps testing and review. placeholder: | - [ ] Scenario 1: ... - [ ] Scenario 2: ... - [ ] Performance / security requirements: ... 
validations: required: true - type: dropdown id: priority attributes: label: Priority description: For planning and ordering options: - P0 (must be done this iteration) - P1 (high) - P2 (normal) - P3 (low) default: 2 validations: required: false - type: textarea id: extra attributes: label: Extra info description: Related APIs, docs, designs, screenshots, linked issues placeholder: | - API docs: - Design / prototype: - Related Issue / ticket: validations: required: false Effect:\nA \u0026ldquo;Feature Request\u0026rdquo; option appears on Issue creation The form replaces plain text Labels feature and enhancement are added Title is prefixed with [Feature] Key fields are required Step 3 (Optional): Bug report template Create .github/ISSUE_TEMPLATE/bug-report.yml:\nname: \u0026#34;Bug Report\u0026#34; description: \u0026#34;Use for bugs and exceptions\u0026#34; title: \u0026#34;[Bug] \u0026#34; labels: - \u0026#34;bug\u0026#34; body: - type: textarea id: summary attributes: label: Summary placeholder: Briefly describe the problem validations: required: true - type: textarea id: steps attributes: label: Steps to reproduce placeholder: | 1. Open ... 2. Click ... 3. See ... validations: required: true - type: textarea id: expected attributes: label: Expected result validations: required: true - type: textarea id: actual attributes: label: Actual result validations: required: true - type: textarea id: extra attributes: label: Extra info description: Logs, screenshots, environment details validations: required: false Now your team can separate feature requests from bugs clearly.\nStep 4: Configure config.yml Create .github/ISSUE_TEMPLATE/config.yml:\nblank_issues_enabled: false # disallow blank Issues, force templates contact_links: - name: Internal request system url: https://example.com/your-internal-system about: If this is a formal request, create it in the internal system first. 
If you do not have an internal system, remove contact_links or replace with your wiki.\nblank_issues_enabled: false forces template usage and prevents empty Issues.\nStep 5: Commit and push git add .github/ISSUE_TEMPLATE/* git commit -m \u0026#34;chore: add GitHub issue templates for feature and bug\u0026#34; git push Templates take effect on the default branch (usually main or master).\nStep 6: Verify in GitHub UI Open your repo Click Issues Click New issue You should see template choices such as:\nFeature Request Bug Report (Optional) Open a blank issue If blank_issues_enabled: false, the blank option disappears.\nMinimal runnable example If you only need a minimal Feature template, do this:\n1) Create folder:\nmkdir -p .github/ISSUE_TEMPLATE 2) Create .github/ISSUE_TEMPLATE/feature-request.yml:\nname: \u0026#34;Feature Request\u0026#34; description: \u0026#34;Use for new features or requirement changes\u0026#34; title: \u0026#34;[Feature] \u0026#34; labels: [\u0026#34;feature\u0026#34;] body: - type: textarea id: background attributes: label: Background / scenario placeholder: Briefly describe why this is needed validations: required: true - type: textarea id: description attributes: label: Requirement description placeholder: | What should the system do? How will users use it? validations: required: true - type: textarea id: acceptance attributes: label: Acceptance criteria placeholder: | - [ ] Scenario 1: ... - [ ] Scenario 2: ... validations: required: true Then:\ngit add .github/ISSUE_TEMPLATE/feature-request.yml git commit -m \u0026#34;add minimal feature request issue template\u0026#34; git push Why YAML forms instead of Markdown? 
Benefits of YAML forms Required field validation for background/requirements/acceptance Friendly UI for non-technical teammates Clear structure for reading and automation Auto labels and title prefixes Markdown template pros and cons Markdown templates are fine, but:\nPros: simple and compatible good for technical teams Cons: cannot enforce required fields UI is less friendly for product/ops roles If you want team standards and clarity, YAML forms are better. For small personal projects, Markdown is enough.\nCommon issues and notes 1. Template not working? Check:\ncorrect path: .github/ISSUE_TEMPLATE/xxx.yml or .github/ISSUE_TEMPLATE/xxx.md default branch: template must be on main/master case sensitivity: ISSUE_TEMPLATE must match exactly 2. Template updated but UI unchanged? browser cache: refresh or use private mode confirm you pushed to GitHub for forks, templates are per-repo and do not inherit upstream 3. Org-level templates? You can configure templates in an org-wide .github repo. Repos without templates will use the org default.\n4. YAML errors? 
YAML is sensitive to indentation and spaces GitHub may ignore or error on malformed YAML Use editor validation (VS Code is great) Best practices Start simple: one feature template first Require core fields: background, description, acceptance Standardize title prefixes: [Feature], [Bug] Auto-label to reduce manual maintenance Keep forms short; balance clarity and friction Review after 1-2 months and refine fields Conclusion This guide covers:\nIssue template concepts (YAML forms vs Markdown) A full Feature template plus optional Bug template The end-to-end workflow: create -\u0026gt; write -\u0026gt; push -\u0026gt; verify Why YAML forms are recommended If you apply these steps, requirement quality will improve immediately.\nReferences and further reading GitHub Docs: Issue and pull request templates Keywords: github issue template yaml github issue forms github .github/ISSUE_TEMPLATE examples Meta Reading time: 8-12 minutes Tags: GitHub, collaboration, Issue templates, team standards, requirements SEO keywords: GitHub Issue template GitHub Issue Template config YAML Issue Form tutorial Feature request template Meta description: A complete guide to configuring GitHub Issue templates (YAML forms and Bug templates) with best practices and pitfalls. 
Call to Action (CTA) If you finished reading, do this now:\nPick a GitHub repo you use often Create .github/ISSUE_TEMPLATE/feature-request.yml Push and open a test Issue to see the effect ","permalink":"https://shio-chan-dev.github.io/jeanblog/notes/git-notes/write-clear-issues-from-zero-to-template/","summary":"\u003ch2 id=\"title-accurate-and-keyword-rich\"\u003eTitle (accurate and keyword-rich)\u003c/h2\u003e\n\u003cp\u003e\u003cstrong\u003eWrite Clear Requirements with Issue Templates: A Complete Guide to GitHub Issue Forms\u003c/strong\u003e\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"subtitle--abstract\"\u003eSubtitle / Abstract\u003c/h2\u003e\n\u003cp\u003eThis post teaches you how to configure GitHub Issue templates for feature requests and bug reports, including folder structure, YAML forms, Markdown templates, and common pitfalls. It is ideal for teams that want clearer requirements and less back-and-forth.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cp\u003eThis article is for:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eBackend/frontend/full-stack engineers who create Issues regularly\u003c/li\u003e\n\u003cli\u003eLeads/TLs/architects who want standardized requirement intake\u003c/li\u003e\n\u003cli\u003eMid-level developers familiar with GitHub but new to Issue templates\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eBeginners can follow, but basic GitHub knowledge is assumed.\u003c/p\u003e","title":"Write Clear Issues with Templates: From Zero to GitHub Issue Forms"},{"content":"Title How to Write a Qualified API Document: From Swagger to Modern OpenAPI\nSubtitle / Abstract Want developers to actually enjoy using your API? 
This article covers the structure, examples, and best practices of high-quality API documentation based on Swagger/OpenAPI (originally by Tony Tam).\nTarget readers Beginners who want a standard API doc structure Mid-level developers improving maintainability Architects and leads defining API standards Background / Motivation Common problems in API docs:\ninconsistent format out-of-date content not usable for automation or testing Tony Tam introduced Swagger (renamed to OpenAPI) in 2010 to solve this. It is now the de facto standard for REST API docs, used by Google, Amazon, Stripe, and more.\nCore concepts Concept Description API doc Technical specification of how to call an API and interpret requests/responses Swagger/OpenAPI Standard to define, generate, and test REST APIs Endpoint A concrete path like /users/{id} Schema Field structure for requests and responses Practical steps Define a clear structure\nOverview Authentication Endpoints Schemas Errors and examples Use OpenAPI (YAML is recommended)\nRecommended tools\nEditors: Swagger Editor, Stoplight Studio, VS Code + YAML Docs: Swagger UI / ReDoc Auto-gen: Springdoc, FastAPI, NestJS Runnable example (OpenAPI) openapi: 3.0.0 info: title: User Management API version: 1.0.0 description: APIs for user management. 
servers: - url: https://api.example.com/v1 paths: /users/{id}: get: summary: Get user by ID parameters: - name: id in: path required: true description: User ID schema: type: string responses: \u0026#39;200\u0026#39;: description: OK content: application/json: schema: $ref: \u0026#39;#/components/schemas/User\u0026#39; \u0026#39;404\u0026#39;: description: User not found components: schemas: User: type: object properties: id: type: string description: Unique user id name: type: string description: User name email: type: string description: Email address You can import this into Swagger Editor for visualization and testing.\nExplanation Why OpenAPI?\nStandardized: avoid custom formats Automated: generate SDKs, tests, mocks Interactive: Swagger UI allows live testing Alternatives:\nRAML (MuleSoft) API Blueprint (documentation-focused) OpenAPI wins because of its tooling ecosystem.\nCommon pitfalls Problem Cause Fix Docs and code drift manual updates auto-generate from code (FastAPI, Springdoc) Schema too complex deep nesting split models with $ref Missing fields in examples no mock testing use mock server to validate Best practices Version your API paths (e.g., /v1/) Standardize error format ({code, message, data}) Keep docs in sync with code Add real examples Validate OpenAPI in CI Summary Great API docs are not just documentation but a collaboration bridge. Swagger/OpenAPI is about making APIs machine-readable and human-usable. With the right structure and tools, your APIs become easier to maintain and test.\nReferences OpenAPI Specification Swagger Editor ReDoc Microsoft API Design Guidelines Meta Reading time: 7 minutes Tags: API docs, Swagger, OpenAPI, standards, Tony Tam SEO keywords: API documentation standard, Swagger tutorial, OpenAPI example, RESTful design Meta description: Based on Swagger/OpenAPI, this guide explains API doc structure, examples, and best practices. 
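The best-practices list recommends validating OpenAPI in CI. A deliberately crude, dependency-free sketch of such a gate — the file path and the grep checks are illustrative; a real pipeline should use a dedicated OpenAPI linter rather than pattern matching:

```shell
# Write a trimmed copy of the spec above, then assert its basic structure.
spec=/tmp/openapi-demo.yaml
cat > "$spec" <<'EOF'
openapi: 3.0.0
info:
  title: User Management API
  version: 1.0.0
paths:
  /users/{id}:
    get:
      summary: Get user by ID
EOF

# Fail fast if required top-level sections are missing.
grep -q '^openapi: 3\.' "$spec" || { echo "not an OpenAPI 3.x file"; exit 1; }
grep -q '^paths:' "$spec"       || { echo "no paths section"; exit 1; }
grep -q '/users/{id}:' "$spec"  || { echo "documented endpoint missing"; exit 1; }
echo "basic structure OK"
```

Even this coarse check catches the most common drift failure: code renames an endpoint while the documented path goes stale.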
Call to Action (CTA) Try it now:\nOpen Swagger Editor and paste the YAML above Or follow this series on API design; next post: auto-generate SDKs with OpenAPI ","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/api-standards/","summary":"\u003ch1 id=\"title\"\u003eTitle\u003c/h1\u003e\n\u003cp\u003e\u003cstrong\u003eHow to Write a Qualified API Document: From Swagger to Modern OpenAPI\u003c/strong\u003e\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"subtitle--abstract\"\u003eSubtitle / Abstract\u003c/h2\u003e\n\u003cp\u003eWant developers to actually enjoy using your API? This article covers the structure, examples, and best practices of high-quality API documentation based on Swagger/OpenAPI (originally by Tony Tam).\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eBeginners who want a standard API doc structure\u003c/li\u003e\n\u003cli\u003eMid-level developers improving maintainability\u003c/li\u003e\n\u003cli\u003eArchitects and leads defining API standards\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eCommon problems in API docs:\u003c/p\u003e","title":"API Standards"},{"content":"For a system, a single thread should be a single assistant. We should provide each user with one assistant and optimize that assistant.\nProviding many parallel threads per user is too expensive and unnecessary.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/thoughts-on-ai-systems/","summary":"\u003cp\u003eFor a system, a single thread should be a single assistant. 
We should provide each user with one assistant and optimize that assistant.\u003c/p\u003e\n\u003cp\u003eProviding many parallel threads per user is too expensive and unnecessary.\u003c/p\u003e","title":"Thoughts on AI Systems"},{"content":"Run Gitea Locally: Your Private GitHub (with Existing Repo Import) Subtitle / Abstract: This guide walks you through installing the lightweight Git server Gitea on your local machine. No root required, no system pollution. Manage, browse, and push projects like GitHub, and import existing repos.\nTarget readers: Personal developers, indie engineers, and small team leads with basic Git knowledge.\nBackground / Motivation Many developers want:\nto host code inside a company or LAN to avoid cloud platforms (GitHub/Gitee) to have a web UI, pull requests, and code browsing GitLab is heavy (often multiple GB of RAM). Gitea is:\nlightweight a single binary supports PR, Wiki, Issues, CI/CD In minutes, you get a private \u0026ldquo;mini GitHub\u0026rdquo;.\nCore concepts Term Description GitLab most powerful open-source Git platform, heavy resource usage Gitea lightweight self-hosted Git service with GitHub-like UI Bare repo repo with history only, no working tree Pull Request merge request from one branch to another SQLite default lightweight database for Gitea Setup steps 1) Prepare environment Supported OS: Linux / macOS / Windows Recommended: RAM \u0026gt;= 512MB, disk \u0026gt;= 1GB\n2) Create directory and download mkdir -p ~/gitea cd ~/gitea wget -O gitea https://dl.gitea.io/gitea/1.22.0/gitea-1.22.0-linux-amd64 chmod +x gitea 3) Start Gitea ./gitea web --port 3000 Open: http://localhost:3000\n4) Install wizard Fill in:\nDB type: SQLite3 Repo root: /home/\u0026lt;username\u0026gt;/gitea/repos Base URL: http://localhost:3000 Create admin account Runnable example: push an existing repo Assume your local project is /home/gong/projects/scrapy:\nCreate a repo named scrapy in Gitea In your project directory: cd ~/projects/scrapy git 
remote set-url origin http://localhost:3000/JeanphiloGong/scrapy.git git push -u origin --all git push -u origin --tags Refresh the web UI to see full history.\nRegister as a system service 1) Prerequisites Assume Gitea is installed at:\n/home/gong/gitea Binary path:\n/home/gong/gitea/gitea Run as user gong and do not use root.\n2) Create systemd service Create /etc/systemd/system/gitea.service:\n[Unit] Description=Gitea (Self-hosted Git Service) After=network.target [Service] # User and group User=gong Group=gong # Working directory WorkingDirectory=/home/gong/gitea # Start command ExecStart=/home/gong/gitea/gitea web --config /home/gong/gitea/custom/conf/app.ini # Restart policy Restart=always RestartSec=10s # Environment (optional) Environment=USER=gong HOME=/home/gong GITEA_WORK_DIR=/home/gong/gitea # Security PrivateTmp=true ProtectSystem=full NoNewPrivileges=true [Install] WantedBy=multi-user.target Notes:\nWorkingDirectory is the Gitea directory ExecStart defines the launch command Restart=always ensures auto-restart 3) Load and enable # Reload systemd sudo systemctl daemon-reload # Enable auto-start sudo systemctl enable gitea # Start service sudo systemctl start gitea # Check status sudo systemctl status gitea Expected:\nActive: active (running) 4) View logs Real-time logs:\nsudo journalctl -u gitea -f History logs:\nsudo journalctl -u gitea --since \u0026#34;1 hour ago\u0026#34; Explanation Gitea is a Go-based self-hosted Git service. 
It manages local Git repos (e.g., ~/gitea/repos) and exposes GitHub-like operations via HTTP/SSH.\nCompared to git init --bare (just storage), Gitea adds web UI, users, PRs, and Wiki.\nCommon issues Issue Cause Fix Port 3000 in use Another service uses it Run ./gitea web --port 8080 Permission errors Gitea runs as current user Check repo directory permissions Push fails Repo init conflict Do not select \u0026ldquo;Initialize with README\u0026rdquo; Push slow/timeouts Using HTTP not SSH Configure SSH keys for faster pushes Best practices SQLite is enough for personal or small teams\nRun in background: nohup ./gitea web \u0026amp;\nBackup regularly:\n~/gitea/repos/ ~/gitea/data/gitea.db ~/gitea/custom/conf/app.ini If the team grows, move to a server or Docker\nSummary You have:\nDeployed Gitea locally Avoided port conflicts Pushed existing repos to Gitea Gained a web UI, PRs, and history You now have your own private \u0026ldquo;GitHub\u0026rdquo;.\nReferences Gitea docs Gitea downloads Pro Git book Forgejo Meta Reading time: 8 minutes Tags: Git, Gitea, self-hosted, DevOps, version-control SEO keywords: local Gitea install, self-hosted Git server, private GitHub, import local repo Meta description: Set up a lightweight local Git server with Gitea, including PRs, web UI, and repo management. Call to Action (CTA) Try it now:\nRun the install commands Visit http://localhost:3000 Create your first repo Push a project If you want automation and backup scripts, leave a comment and I will share a follow-up.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/notes/git-notes/configure-gitea/","summary":"\u003ch1 id=\"run-gitea-locally-your-private-github-with-existing-repo-import\"\u003eRun Gitea Locally: Your Private GitHub (with Existing Repo Import)\u003c/h1\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract:\u003c/strong\u003e\nThis guide walks you through installing the lightweight Git server Gitea on your local machine. 
No root required, no system pollution. Manage, browse, and push projects like GitHub, and import existing repos.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTarget readers:\u003c/strong\u003e\nPersonal developers, indie engineers, and small team leads with basic Git knowledge.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h2\u003e\n\u003cp\u003eMany developers want:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eto host code inside a company or LAN\u003c/li\u003e\n\u003cli\u003eto avoid cloud platforms (GitHub/Gitee)\u003c/li\u003e\n\u003cli\u003eto have a web UI, pull requests, and code browsing\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eGitLab is heavy (often multiple GB of RAM). Gitea is:\u003c/p\u003e","title":"How to Set Up Gitea"},{"content":"Title From feat to fix: master Git commit conventions for collaboration and automation\nSubtitle / Abstract A practical guide to Conventional Commits. Learn commit types (feat:, fix:), write clean messages, and enable automatic changelogs and releases.\nTarget readers Beginners: new to Git, want better commit habits. Mid-level devs: want commits friendly to team and CI. Leads/architects: want a consistent team standard. Background / Motivation Most commit messages look like:\n\u0026ldquo;update code\u0026rdquo; \u0026ldquo;fix bug\u0026rdquo; \u0026ldquo;some changes\u0026rdquo;\nThey are readable short-term but useless long-term. As teams grow, it becomes hard to track intent or automate releases.\nConventional Commits provides a simple, unified format so commits are readable, traceable, and automatable.\nCore concept Conventional Commits define a commit message structure:\n\u0026lt;type\u0026gt;(\u0026lt;scope\u0026gt;): \u0026lt;subject\u0026gt; \u0026lt;body\u0026gt; \u0026lt;footer\u0026gt; type: commit type, e.g. feat, fix, docs scope: optional area, e.g. 
ui, api subject: short description (\u0026lt;= 50 chars) body: details (optional) footer: metadata (e.g., BREAKING CHANGE) Practical steps Set Git editor to Neovim (optional) git config --global core.editor \u0026#34;nvim\u0026#34; Write a standard commit message git commit -m \u0026#34;feat(lsp): support new nvim-lspconfig API\u0026#34; Structured commit example feat(lsp): update LSP config for new nvim-lspconfig - remove old lspconfig[server].setup - use new function call lspconfig(server, {...}) Enforce with tooling (optional) npm install -g commitlint @commitlint/config-conventional Create .commitlintrc.js:\nmodule.exports = { extends: [\u0026#34;@commitlint/config-conventional\u0026#34;] }; Runnable examples # new feature git commit -m \u0026#34;feat(auth): add two-factor login\u0026#34; # bug fix git commit -m \u0026#34;fix(ui): fix text invisibility in dark mode\u0026#34; # docs git commit -m \u0026#34;docs(readme): add usage notes\u0026#34; # refactor git commit -m \u0026#34;refactor(api): optimize auth logic\u0026#34; # performance git commit -m \u0026#34;perf(db): improve query cache\u0026#34; Explanation This standard comes from the Angular commit message format and became the Conventional Commits spec.\nBenefits:\nClear structure: see type and scope at a glance Machine-readable: auto changelog generation Easy integration: with semantic-release Alternatives:\nGitmoji (emoji commits) Semantic Versioning (release versioning) FAQ Question Answer Can I mix English and Chinese? Yes, but keep it consistent. Prefer English in the subject. What if one commit covers multiple types? Split into multiple commits. What if I cannot write a long message? At least explain \u0026ldquo;why\u0026rdquo; in one line. Is scope required? Optional, but recommended. 
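The format above can also be enforced locally with a tiny `commit-msg` hook, independent of Node tooling. A POSIX-sh sketch — the type list and the 50-character limit mirror the conventions in this article, but both are adjustable; commitlint (shown earlier) remains the full-featured option:

```shell
# Sketch of the check a .git/hooks/commit-msg hook would run on the
# first line of the commit message. Returns 0 for a conforming subject.
check_commit_msg() {
  first_line=$1
  # <type>(<scope>): <subject> — scope is optional
  echo "$first_line" | grep -Eq \
    '^(feat|fix|docs|refactor|perf|test|chore)(\([a-z0-9-]+\))?: .+' || return 1
  # keep the subject line at or under 50 characters
  [ "${#first_line}" -le 50 ] || return 1
}

check_commit_msg "feat(auth): add two-factor login" && echo "ok"
check_commit_msg "update code" || echo "rejected"
```

Wiring it up is one step: copy the function into `.git/hooks/commit-msg`, call it with `$(head -n 1 "$1")`, and `chmod +x` the hook.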
Best practices One commit does one thing Subject in lowercase, no period First line \u0026lt;= 50 characters Blank line after subject, details start on line 3 Start with a verb (add, fix, update) Conclusion Commit conventions are a small investment with big returns: better history, smoother collaboration, and automation.\nReferences Conventional Commits Angular Commit Message Guidelines semantic-release Gitmoji Meta Reading time: about 6 minutes Tags: Git, standards, Conventional Commits, collaboration SEO keywords: Git commit conventions, Conventional Commits, feat fix refactor, best practices Meta description: A practical guide to writing clean commit messages with feat: and fix: and enabling automation. Call to Action (CTA) Try this in your next commit:\ngit commit -m \u0026#34;feat: first commit with conventional commits\u0026#34; Share your team conventions and lessons learned in the comments.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/notes/git-notes/git-commit-conventions-team-efficiency/","summary":"\u003ch3 id=\"title\"\u003eTitle\u003c/h3\u003e\n\u003cp\u003eFrom \u003ccode\u003efeat\u003c/code\u003e to \u003ccode\u003efix\u003c/code\u003e: master Git commit conventions for collaboration and automation\u003c/p\u003e\n\u003chr\u003e\n\u003ch3 id=\"subtitle--abstract\"\u003eSubtitle / Abstract\u003c/h3\u003e\n\u003cp\u003eA practical guide to Conventional Commits. 
Learn commit types (\u003ccode\u003efeat:\u003c/code\u003e, \u003ccode\u003efix:\u003c/code\u003e), write clean messages, and enable automatic changelogs and releases.\u003c/p\u003e\n\u003chr\u003e\n\u003ch3 id=\"target-readers\"\u003eTarget readers\u003c/h3\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eBeginners\u003c/strong\u003e: new to Git, want better commit habits.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMid-level devs\u003c/strong\u003e: want commits friendly to team and CI.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLeads/architects\u003c/strong\u003e: want a consistent team standard.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch3 id=\"background--motivation\"\u003eBackground / Motivation\u003c/h3\u003e\n\u003cp\u003eMost commit messages look like:\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u0026ldquo;update code\u0026rdquo;\n\u0026ldquo;fix bug\u0026rdquo;\n\u0026ldquo;some changes\u0026rdquo;\u003c/p\u003e","title":"Conventional Commits: Make Team Collaboration and Automation Efficient"},{"content":"Bengio-style ML Task Specification: From Research to Engineering Subtitle: How to write a reproducible, explainable, and comparable fine-tuning task document based on Yoshua Bengio\u0026rsquo;s methodology.\nReading time: 10 minutes Tags: ML documentation, fine-tuning, technical standards, deep learning practice Audience: mid to senior ML engineers, researchers, technical writers\n1. Why do we need this document? In ML projects, teams often run fine-tuning experiments. Months later, nobody can reproduce results or explain why a learning rate or LoRA layer was chosen.\nYoshua Bengio (one of the deep learning pioneers) proposed the idea that an ML task document must allow others to fully reproduce results and understand the design rationale. This became the Bengio-style ML project report structure, used by Google Research, Meta AI, OpenAI, and others.\n2. 
Core ideas of the Bengio template Item Description Source Yoshua Bengio, \u0026ldquo;Deep Learning Research Practice Notes\u0026rdquo; Goal Ensure ML experiments are reproducible, understandable, and comparable Use cases Fine-tuning, comparison studies, research reports, internal docs Benefits Clear structure, unified format, easy to convert into papers or internal whitepapers 3. Standard structure (nine sections) 1) Title page Document title (e.g., \u0026ldquo;Design and Implementation of Four Fine-Tuning Tasks\u0026rdquo;) Author, date, version Project or organization name 2) Abstract Briefly describe goals, model direction, and expected outcomes.\nExample:\nThis document describes the design, experiment plan, and evaluation for fine-tuning four architectures, comparing performance on a specific dataset.\n3) Background and motivation Explain:\ncurrent system limitations why fine-tuning is needed related papers and existing results scientific or business motivation Example: \u0026ldquo;Current LMs generalize poorly in low-resource domains, so we propose parameter-efficient fine-tuning on multilingual data.\u0026rdquo;\n4) Problem definition Define inputs/outputs, task type, and metrics:\nTask type: classification / generation / regression I/O format: text -\u0026gt; label or text -\u0026gt; text Metrics: accuracy, F1, BLEU, loss Constraints: compute budget, time, data privacy 5) Models and approach For each model, record:\narchitecture (Llama-3, Phi-3, Gemma, etc.) 
fine-tuning method (Full FT, LoRA, Adapter, QLoRA) key hyperparameters (batch size, epochs, LR) Model Method Dataset Epochs Learning Rate Model A LoRA Dataset X 5 3e-5 Model B Full Dataset X 3 2e-5 Model C Adapter Dataset Y 10 1e-4 Model D QLoRA Dataset Z 4 1e-5 6) Experimental setup Environment (GPU type, framework, version) Data split (train/val/test) Random seeds and reproducibility controls Logging tools (e.g., Weights \u0026amp; Biases) 7) Results and analysis Include:\nmetric tables and plots (accuracy, loss curves) model size vs performance trade-offs unexpected results and explanations Tip: include TensorBoard or matplotlib plots to show convergence trends.\n8) Conclusion and future work Which model performed best? Why (architecture, optimization)? Future directions (multi-task learning, quantization) 9) Appendix and references additional logs and code paths cited papers and open-source repos 4. Best practices Ensure reproducibility (version lock + seeds) Record motivation and assumptions per model Use tables/plots for comparability Use structured headings for team sharing and future papers 5. Summary The Bengio-style ML document is not just a format, it is a research culture. It makes collaboration transparent and results verifiable.\nReferences Yoshua Bengio, Deep Learning Research Practice Notes OpenAI Technical Reports (fine-tuning guides) Google Research: Effective ML Experiment Documentation Meta AI: Reproducibility Checklist for ML Models Call to Action Try writing your next fine-tuning report using this template. 
Use it as a team standard and improve reproducibility.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/how-to-write-a-perfect-ml-document/","summary":"\u003ch1 id=\"bengio-style-ml-task-specification-from-research-to-engineering\"\u003eBengio-style ML Task Specification: From Research to Engineering\u003c/h1\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle:\u003c/strong\u003e\nHow to write a reproducible, explainable, and comparable fine-tuning task document based on Yoshua Bengio\u0026rsquo;s methodology.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eReading time:\u003c/strong\u003e 10 minutes\n\u003cstrong\u003eTags:\u003c/strong\u003e ML documentation, fine-tuning, technical standards, deep learning practice\n\u003cstrong\u003eAudience:\u003c/strong\u003e mid to senior ML engineers, researchers, technical writers\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"1-why-do-we-need-this-document\"\u003e1. Why do we need this document?\u003c/h2\u003e\n\u003cp\u003eIn ML projects, teams often run fine-tuning experiments. Months later, nobody can reproduce results or explain why a learning rate or LoRA layer was chosen.\u003c/p\u003e","title":"How to Write a Perfect Machine Learning Document"},{"content":"Ping Works but SSH Fails: A Real Case of SSH vs VNC Subtitle: From connection refusal to protocol identification: understand TCP, SSH, and VNC Reading time: 7 minutes Tags: network troubleshooting, SSH, VNC, Linux, remote access SEO keywords: SSH connection failed, kex_exchange_identification, VNC port 5905, RFB 003.008, SSH vs VNC\nTarget readers Linux users, developers, and server admins Engineers learning systematic network troubleshooting Readers interested in SSH/VNC protocol behavior Background and motivation Have you seen this?\n\u0026ldquo;The server can be pinged, but SSH does not connect.\u0026rdquo;\nThis is common on hosts running multiple services (SSH, VNC, HTTP). 
This article walks through a real case: from \u0026ldquo;SSH failed\u0026rdquo; to finding out the port was actually VNC.\nSymptoms Command:\nssh chenhm@101.6.142.82 -p 5905 Output:\nkex_exchange_identification: Connection closed by remote host Connection closed by 101.6.142.82 port 5905 Ping test:\nping 101.6.142.82 It succeeds with no packet loss.\nSo we know:\nHost is online Network is reachable SSH handshake failed Core concepts Concept Meaning Ping ICMP test for connectivity only TCP Transport protocol that builds connections SSH Application protocol on top of TCP for secure login VNC / RFB Remote desktop protocol (Remote Frame Buffer) In short: Ping OK != SSH OK because they are different layers.\nTroubleshooting steps Step 1. Test TCP connectivity telnet 101.6.142.82 5905 Output:\nTrying 101.6.142.82... Connected to 101.6.142.82. Escape character is \u0026#39;^]\u0026#39;. RFB 003.008 Key clue: RFB 003.008 is the VNC handshake string (Remote Frame Buffer v3.8).\nThis means:\nPort 5905 is open It is running VNC, not SSH Why this happens After TCP connects, SSH sends a greeting like SSH-2.0-OpenSSH_8.x. A VNC server replies with RFB 003.008 instead. Protocol mismatch causes the SSH client to close, resulting in kex_exchange_identification.\nVerification Check the process on the port sudo ss -tlnp | grep 5905 Possible output:\nLISTEN 0 5 0.0.0.0:5905 ... 
/usr/bin/Xvnc Check SSH port sudo grep ^Port /etc/ssh/sshd_config If it returns Port 22, SSH is still on the default port.\nCorrect connection If you want the GUI Use a VNC client:\nvncviewer 101.6.142.82:5905 Or tools like:\nRealVNC TigerVNC TightVNC If you want the terminal Use SSH on the correct port:\nssh chenhm@101.6.142.82 -p 22 Common issues and fixes Problem Cause Fix Connection closed by remote host Protocol mismatch (SSH to VNC) Use the correct protocol SSH fails on all ports SSH service not running sudo systemctl start sshd VNC refused Firewall blocked firewall-cmd --add-port=5905/tcp --permanent SSH disconnected fail2ban ban check /var/log/auth.log Best practices Separate port and protocol: port number alone does not identify service type. Use telnet or nc to read protocol banners. Check logs: journalctl -u ssh, /var/log/auth.log. Define a clear port map for multi-service hosts: SSH -\u0026gt; 22 VNC -\u0026gt; 5900+ HTTP -\u0026gt; 80/8080 HTTPS -\u0026gt; 443 Summary This case shows:\nHow to separate network, transport, and application layer issues How to identify protocol banners (RFB vs SSH) How to find the real service on a port One-line conclusion:\nSSH is fine; you connected to the wrong service.\nReferences OpenSSH Manual TigerVNC GitHub [Linux man pages: ssh, telnet, ss, netstat] RFC 6143: The Remote Framebuffer Protocol (RFB) Call to Action Try this on your own server:\nnc \u0026lt;server_ip\u0026gt; \u0026lt;port\u0026gt; Check the banner you get back. 
You may discover more \u0026ldquo;hidden services\u0026rdquo;.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/ping-works-ssh-fails-fake-ssh-true-vnc/","summary":"\u003ch1 id=\"ping-works-but-ssh-fails-a-real-case-of-ssh-vs-vnc\"\u003ePing Works but SSH Fails: A Real Case of SSH vs VNC\u003c/h1\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle:\u003c/strong\u003e From connection refusal to protocol identification: understand TCP, SSH, and VNC\n\u003cstrong\u003eReading time:\u003c/strong\u003e 7 minutes\n\u003cstrong\u003eTags:\u003c/strong\u003e network troubleshooting, SSH, VNC, Linux, remote access\n\u003cstrong\u003eSEO keywords:\u003c/strong\u003e SSH connection failed, kex_exchange_identification, VNC port 5905, RFB 003.008, SSH vs VNC\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eLinux users, developers, and server admins\u003c/li\u003e\n\u003cli\u003eEngineers learning systematic network troubleshooting\u003c/li\u003e\n\u003cli\u003eReaders interested in SSH/VNC protocol behavior\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"background-and-motivation\"\u003eBackground and motivation\u003c/h2\u003e\n\u003cp\u003eHave you seen this?\u003c/p\u003e","title":"Ping Works but SSH Fails: A Real Case of SSH vs VNC"},{"content":"Below is a full draft based on your SSH startup and debugging process. It is ready for publication on a technical blog.\nRun SSH Without sudo on Linux (User-Level sshd Guide) Subtitle / Abstract: When you have no root access in a lab or restricted server environment, how do you start SSH and access your account remotely? 
This guide shows how to run sshd in your user directory, enable key login, and connect remotely.\nReading time: 10 minutes Target readers: intermediate Linux users, researchers, server users, DevOps learners Tags: SSH, sshd, Linux, remote access, non-root, system config SEO keywords: SSH without root, user-level sshd, openssh config, unprivileged ports, remote login failed\nBackground and motivation Many research servers and shared hosts do not grant sudo. But we still need:\nremote login file upload/download access from another machine By default, sshd requires root because it binds port 22 and reads system auth info. However, you can run a user-level SSH service in your home directory without changing system config.\nCore concepts Term Meaning sshd SSH server daemon that accepts connections user-space sshd sshd started by a normal user, no root privileges HostKey key pair used to encrypt SSH connections AuthorizedKeys list of public keys allowed to log in /etc/shadow password hash file; non-root cannot read Step-by-step: Start user-level SSH Step 1: Prepare config Create directory:\nmkdir -p ~/.ssh Create config ~/.ssh/ssh_config:\nPort 2222 ListenAddress 0.0.0.0 HostKey /home/\u0026lt;username\u0026gt;/.ssh/ssh_host_ed25519_key AuthorizedKeysFile /home/\u0026lt;username\u0026gt;/.ssh/authorized_keys PasswordAuthentication yes PubkeyAuthentication yes ChallengeResponseAuthentication no PidFile /home/\u0026lt;username\u0026gt;/.ssh/sshd.pid Note: do not use ~ in paths; OpenSSH will not expand it.\nStep 2: Generate host keys ssh-keygen -t ed25519 -f ~/.ssh/ssh_host_ed25519_key -N \u0026#34;\u0026#34; chmod 600 ~/.ssh/ssh_host_ed25519_key Step 3: Start user-level sshd /usr/bin/sshd -d -f ~/.ssh/ssh_config If you see:\nServer listening on 0.0.0.0 port 2222. then it is running. Test locally:\nssh -p 2222 \u0026lt;username\u0026gt;@localhost Explanation Why use port 2222? Ports \u0026lt; 1024 are privileged and require root. 
Use 2222 or 8022 instead.\nWhy \u0026ldquo;Could not get shadow information\u0026rdquo;? Non-root users cannot read /etc/shadow, so password auth fails. Use public keys instead.\nUse SSH key login (recommended) Generate local key (no email comment):\nssh-keygen -t ed25519 -C \u0026#34;\u0026#34; -f ~/.ssh/id_ed25519_noemail Add to authorized keys:\ncat ~/.ssh/id_ed25519_noemail.pub \u0026gt;\u0026gt; ~/.ssh/authorized_keys chmod 700 ~/.ssh chmod 600 ~/.ssh/authorized_keys Test login:\nssh -i ~/.ssh/id_ed25519_noemail -p 2222 \u0026lt;username\u0026gt;@localhost Allow remote access Ensure sshd listens on all addresses\nss -tlnp | grep 2222 If output is 127.0.0.1:2222, it is local only. Set:\nListenAddress 0.0.0.0 and restart sshd.\nFirewall and NAT\nIf external access shows \u0026ldquo;Connection refused\u0026rdquo;, firewall or NAT is blocking. If localhost works but public IP fails, open the port or configure forwarding. Run sshd in background\nnohup /usr/bin/sshd -f ~/.ssh/ssh_config -E ~/.ssh/sshd.log \u0026amp; tail -f ~/.ssh/sshd.log Common issues Issue Cause Fix Permission denied (password) cannot read /etc/shadow use key auth Address already in use port in use kill old process or change port Bind to port failed tried port 22 use port \u0026gt; 1024 Connection refused firewall / NAT block check listen address and policies Could not load host key HostKey path wrong use absolute path and chmod 600 Best practices Use ed25519 keys (secure and fast). In non-root environments, use key-only auth. Keep ~/.ssh at 700 and authorized_keys at 600. Do not expose your home directory or host keys. If remote access is needed, ensure ListenAddress 0.0.0.0 and open ports. 
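The permission rules in this list are the most common silent failure (sshd refuses keys without logging an obvious reason). A small pre-flight sketch that checks them before launching sshd — GNU `stat -c` is assumed, with the macOS variant noted in a comment:

```shell
# Verify the 700 / 600 permission rules before starting a user-level sshd.
# Pass a directory, or let it default to ~/.ssh.
check_ssh_perms() {
  dir=${1:-"$HOME/.ssh"}
  # stat -c '%a' is GNU coreutils; on macOS use: stat -f '%Lp'
  [ "$(stat -c '%a' "$dir")" = "700" ] \
    || { echo "expected mode 700 on $dir"; return 1; }
  if [ -f "$dir/authorized_keys" ]; then
    [ "$(stat -c '%a' "$dir/authorized_keys")" = "600" ] \
      || { echo "expected mode 600 on authorized_keys"; return 1; }
  fi
  echo "permissions OK"
}
```

Run `check_ssh_perms` once before `sshd -d`; a failure here explains most "Permission denied (publickey)" results with user-level setups.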
Summary This guide shows how to:\nStart SSH without sudo Enable key auth to avoid /etc/shadow Support both local and remote login Debug common errors like \u0026ldquo;Connection refused\u0026rdquo; You can now run your own SSH service under a normal account.\nReferences OpenSSH manual man sshd_config RFC 4251: The Secure Shell Protocol Architecture Linux file permissions Call to Action (CTA) Try starting your own user-level sshd using the steps above. Save and share this guide for restricted environments. Share your SSH deployment pitfalls and fixes. Do you want a Markdown version with syntax highlighting ready for publishing?\n","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/enable-ssh-without-sudo/","summary":"\u003cp\u003eBelow is a full draft based on your SSH startup and debugging process. It is ready for publication on a technical blog.\u003c/p\u003e\n\u003chr\u003e\n\u003ch1 id=\"run-ssh-without-sudo-on-linux-user-level-sshd-guide\"\u003eRun SSH Without sudo on Linux (User-Level sshd Guide)\u003c/h1\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract:\u003c/strong\u003e\nWhen you have no root access in a lab or restricted server environment, how do you start SSH and access your account remotely? This guide shows how to run \u003ccode\u003esshd\u003c/code\u003e in your user directory, enable key login, and connect remotely.\u003c/p\u003e","title":"Run SSH Without sudo: User-Level sshd on Linux"},{"content":"Title: Run sshd Without sudo: Troubleshooting, nohup, and systemd (User-Level SSH)\nSubtitle / Abstract: How to run OpenSSH as a normal user, solve common errors like \u0026ldquo;connection refused\u0026rdquo;, \u0026ldquo;password auth failed\u0026rdquo;, and start-limit-hit, and keep sshd alive using nohup or systemd.\nTarget readers: Intermediate Linux users, researchers on shared servers, and anyone who needs SSH without root.\n1. Background / Motivation In some lab or shared environments, regular users do not have sudo. 
The default sshd service cannot be started. If you need to:\nremote into your Linux host use VS Code Remote or SCP but cannot change system config then you must run sshd in user space. This introduces issues: port conflicts, firewall rules, auth failures, and start-limit-hit.\n2. Core concepts Term Meaning sshd OpenSSH daemon that handles SSH logins user-level sshd sshd started by a normal user, no root privileges authorized_keys list of allowed public keys nohup run a process detached from the terminal systemd --user user-level systemd instance for services start-limit-hit systemd pauses restarts after frequent failures 3. Full setup steps 1) Generate and configure SSH keys ssh-keygen -t ed25519 -C \u0026#34;\u0026#34; -f ~/.ssh/id_ed25519_noemail cat ~/.ssh/id_ed25519_noemail.pub \u0026gt;\u0026gt; ~/.ssh/authorized_keys chmod 700 ~/.ssh chmod 600 ~/.ssh/authorized_keys Ensure ~/.ssh/authorized_keys permissions are correct.\n2) Create a user-level sshd config ~/.ssh/ssh_config_pub\nPort 2223 ListenAddress 0.0.0.0 HostKey /home/chenhm/.ssh/ssh_host_ed25519_key AuthorizedKeysFile /home/chenhm/.ssh/authorized_keys PasswordAuthentication no PubkeyAuthentication yes PidFile /home/chenhm/.ssh/sshd_pub.pid LogLevel INFO SyslogFacility AUTH Generate host key:\nssh-keygen -t ed25519 -f ~/.ssh/ssh_host_ed25519_key -N \u0026#34;\u0026#34; 3) Start in debug mode /usr/bin/sshd -d -f ~/.ssh/ssh_config_pub If you see:\nServer listening on 0.0.0.0 port 2223\nthen it is running.\n4. Two ways to keep it running Option A: nohup (simplest) nohup /usr/bin/sshd -f ~/.ssh/ssh_config_pub -E ~/.ssh/sshd_pub.log \u0026gt;/dev/null 2\u0026gt;\u0026amp;1 \u0026amp; Keeps running after the terminal closes\nCheck process:\nps -ef | grep \u0026#34;sshd -f\u0026#34; Check logs:\ntail -f ~/.ssh/sshd_pub.log Stop:\npkill -f \u0026#34;sshd -f /home/chenhm/.ssh/ssh_config_pub\u0026#34; Pros: no dependencies, works instantly. 
Cons: does not auto-start after reboot.\nOption B: systemd user service (auto-restart/auto-start) 1) Create the unit file ~/.config/systemd/user/sshd-user.service\n[Unit] Description=User-level SSH server [Service] Type=forking ExecStart=/usr/bin/sshd -f /home/chenhm/.ssh/ssh_config_pub -E /home/chenhm/.ssh/sshd_pub.log PIDFile=/home/chenhm/.ssh/sshd_pub.pid Restart=on-failure RestartSec=5 [Install] WantedBy=default.target 2) Enable and start systemctl --user daemon-reload systemctl --user enable sshd-user systemctl --user start sshd-user 3) Verify systemctl --user status sshd-user ss -tlnp | grep sshd You should see Active: active (running) and 0.0.0.0:2223.\n5. Troubleshooting table Error Cause Fix Connection refused sshd not listening on public interface or firewall blocked set ListenAddress 0.0.0.0, check ss -tlnp Permission denied (password) no access to /etc/shadow use public key auth Bind to port ... failed: Address already in use port already used by old sshd pkill -f \u0026quot;sshd -f\u0026quot; start-limit-hit systemd sees frequent crashes set Type=forking and PIDFile= No logs wrong path or permission use -E ~/.ssh/sshd.log 6. Why this works User-level sshd does not need root because it binds to ports \u0026gt;= 1024. Public key auth avoids /etc/shadow access. Type=forking lets systemd track the daemon correctly. PIDFile helps systemd manage the process. 7. Notes Port \u0026gt; 1024: non-root cannot bind to privileged ports. Firewall: must allow your chosen port. Permissions: ~/.ssh must be 700 and authorized_keys must be 600. Multiple instances: use separate PidFile and log paths. Auto-start: systemctl --user enable sshd-user. 8. Best practices Use nohup for testing or temporary runs. Use systemd --user for stable long-term service. Expose only key-based auth on public interfaces. Separate internal vs external ports. Use @reboot cron as fallback if systemd is unavailable. 9. 
Conclusion This guide showed how to deploy SSH without sudo:\nGenerate keys and enable key auth Create user-level sshd config Validate with nohup, then stabilize with systemd Fix start-limit-hit, port conflicts, and auth failures You get:\nMultiple ports and instances Auto-restart Auto-start Secure remote access References OpenSSH manual systemd user services OpenSSH key management Meta\nReading time: about 10 minutes Tags: SSH, Linux, systemd, nohup, no-sudo SEO keywords: no-sudo sshd systemd user OpenSSH start-limit-hit Meta description: A complete guide to running OpenSSH without sudo and fixing common startup errors. Call to Action (CTA) Try running a user-level sshd on your lab server. If this helps, share your setup and lessons learned.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/fix-sshsystem-process-start-failure/","summary":"\u003cp\u003e\u003cstrong\u003eTitle:\u003c/strong\u003e\nRun sshd Without sudo: Troubleshooting, nohup, and systemd (User-Level SSH)\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract:\u003c/strong\u003e\nHow to run OpenSSH as a normal user, solve common errors like \u0026ldquo;connection refused\u0026rdquo;, \u0026ldquo;password auth failed\u0026rdquo;, and \u003ccode\u003estart-limit-hit\u003c/code\u003e, and keep sshd alive using nohup or systemd.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTarget readers:\u003c/strong\u003e\nIntermediate Linux users, researchers on shared servers, and anyone who needs SSH without root.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"1-background--motivation\"\u003e1. Background / Motivation\u003c/h2\u003e\n\u003cp\u003eIn some lab or shared environments, regular users do not have sudo. The default sshd service cannot be started. 
If you need to:\u003c/p\u003e","title":"Run sshd Without sudo: Troubleshooting and Persistent User-Level SSH"},{"content":"Auto-start frp on Ubuntu: A Complete Guide Subtitle / Abstract Use systemd to run frp (Fast Reverse Proxy) as a managed service for stable, secure, and monitored auto-start on boot.\nReading time: 8 minutes Tags: frp, intranet tunneling, systemd, auto-start, Linux, Ubuntu SEO keywords: frp auto start, Ubuntu frp config, frpc systemd, frps service, intranet tunneling Meta description: Step-by-step systemd setup for frp (frpc/frps) with config templates and troubleshooting.\nTarget readers Developers deploying frps on cloud servers Intermediate Linux users building stable home/office tunnels DevOps and self-hosting enthusiasts Background and motivation Many developers use frp to expose internal services (SSH, web, NAS) to the internet. The problem is that running ./frpc -c frpc.ini manually is inconvenient and unreliable after reboot.\nWe want auto-start on boot + auto-restart on failure + centralized logs, which is exactly what systemd provides.\nCore concepts frps / frpc: server and client binaries for frp systemd: service manager for modern Linux unit file: configuration for service startup, dependencies, and restart policy Step-by-step setup 1) Install and place files sudo mv frpc /usr/local/bin/ sudo chmod +x /usr/local/bin/frpc sudo mkdir -p /etc/frp sudo mv frpc.ini /etc/frp/frpc.ini For the server side, replace frpc with frps and frpc.ini with frps.ini.\n2) (Optional) Create a dedicated user sudo useradd --system --no-create-home --shell /sbin/nologin frp sudo chown -R frp:frp /etc/frp 3) Create a systemd unit Create /etc/systemd/system/frpc.service:\n[Unit] Description=frp client service After=network-online.target Wants=network-online.target [Service] Type=simple User=frp Group=frp ExecStart=/usr/local/bin/frpc -c /etc/frp/frpc.ini Restart=on-failure RestartSec=5 LimitNOFILE=65536 [Install] WantedBy=multi-user.target 4) Start and enable 
sudo systemctl daemon-reload sudo systemctl start frpc sudo systemctl enable frpc 5) Check status and logs sudo systemctl status frpc sudo journalctl -u frpc -f Logs are centralized in the systemd journal for easier troubleshooting.\nHow it works WantedBy=multi-user.target ensures auto-start during boot. After=network-online.target starts only after the network is ready. Restart=on-failure auto-restarts frpc on unexpected exit. Compared to @reboot cron, systemd gives better dependency control, restart policy, and unified logs.\nCommon issues and fixes Issue Cause Fix Service fails to start Config file permission issue Ensure /etc/frp/frpc.ini is readable by user frp Network not ready Missing systemd dependencies Enable systemd-networkd-wait-online.service frp cannot connect Firewall or security group blocks Open TCP/UDP ports Service not auto-starting enable not run sudo systemctl enable frpc Best practices Run with non-root user for safety. Ship logs to ELK/Promtail if needed. Enable token auth or TLS in frp configs. For multiple frpc instances, use frpc@name.service templates. Summary You learned how to:\nInstall and configure frp Create a systemd service Enable auto-start and auto-restart Understand common pitfalls Once you understand systemd, you can manage any custom daemon the same way.\nReferences frp docs: https://github.com/fatedier/frp systemd.service: https://www.freedesktop.org/software/systemd/man/systemd.service.html Ubuntu Server Guide - systemd: https://ubuntu.com/server/docs/service-systemd Call to Action Copy the unit file to your server and run sudo systemctl enable --now frpc. If it works, share what you are exposing via frp. 
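The start/enable/status steps above can be scripted as a quick post-install check. This is a minimal sketch that assumes the frpc unit created in this guide; change SVC if your unit has a different name:

```shell
#!/bin/sh
# Post-setup check for the frpc systemd service described above.
SVC=frpc   # unit name from this guide; adjust if yours differs

# Is the unit active right now?
if systemctl is-active "$SVC" >/dev/null 2>&1; then
  active="running"
else
  active="not running"
fi

# Will it start automatically on boot?
if systemctl is-enabled "$SVC" >/dev/null 2>&1; then
  boot="enabled at boot"
else
  boot="not enabled"
fi

echo "$SVC: $active, $boot"
```

If it reports "not enabled", `sudo systemctl enable frpc` was skipped; if "not running", check `journalctl -u frpc` for the failure reason.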
You can also publish a template or script for others.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/frp-auto-start-on-ubuntu/","summary":"\u003ch1 id=\"auto-start-frp-on-ubuntu-a-complete-guide\"\u003eAuto-start frp on Ubuntu: A Complete Guide\u003c/h1\u003e\n\u003cp\u003e\u003cstrong\u003eSubtitle / Abstract\u003c/strong\u003e\nUse systemd to run frp (Fast Reverse Proxy) as a managed service for stable, secure, and monitored auto-start on boot.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eReading time\u003c/strong\u003e: 8 minutes\n\u003cstrong\u003eTags\u003c/strong\u003e: frp, intranet tunneling, systemd, auto-start, Linux, Ubuntu\n\u003cstrong\u003eSEO keywords\u003c/strong\u003e: frp auto start, Ubuntu frp config, frpc systemd, frps service, intranet tunneling\n\u003cstrong\u003eMeta description\u003c/strong\u003e: Step-by-step systemd setup for frp (frpc/frps) with config templates and troubleshooting.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"target-readers\"\u003eTarget readers\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eDevelopers deploying frps on cloud servers\u003c/li\u003e\n\u003cli\u003eIntermediate Linux users building stable home/office tunnels\u003c/li\u003e\n\u003cli\u003eDevOps and self-hosting enthusiasts\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"background-and-motivation\"\u003eBackground and motivation\u003c/h2\u003e\n\u003cp\u003eMany developers use \u003cstrong\u003efrp\u003c/strong\u003e to expose internal services (SSH, web, NAS) to the internet. 
The problem is that running \u003ccode\u003e./frpc -c frpc.ini\u003c/code\u003e manually is inconvenient and unreliable after reboot.\u003c/p\u003e","title":"Auto-start frp on Ubuntu with systemd"},{"content":"Windows + WSL2 Port Forwarding Guide (Access Flask 5000) Prerequisites You are using WSL2 (Ubuntu or another Linux distro) The Windows host can access the LAN (Wi-Fi or Ethernet) A Flask service is running inside WSL2 and listening on: app.run(host=\u0026#34;0.0.0.0\u0026#34;, port=5000) host=\u0026quot;0.0.0.0\u0026quot; is required; otherwise external access will fail.\nStep 1: Check the WSL2 IP In WSL2:\nip addr show eth0 You should see something like:\ninet 172.26.209.37/20 Record the IP after inet (here: 172.26.209.37). This is the WSL2 internal IP.\nStep 2: Open PowerShell (Admin) Press Win + X and select Windows PowerShell (Admin) Confirm admin privileges if prompted by UAC Step 3: Add Port Forwarding In PowerShell, forward Windows port 5000 to WSL2:\n# Forward Windows port 5000 to WSL2 port 5000 netsh interface portproxy add v4tov4 listenport=5000 listenaddress=0.0.0.0 connectport=5000 connectaddress=172.26.209.37 # Allow LAN access through the firewall netsh advfirewall firewall add rule name=\u0026#34;WSL Flask 5000\u0026#34; dir=in action=allow protocol=TCP localport=5000 listenaddress=0.0.0.0 listens on all Windows interfaces connectaddress=172.26.209.37 is the WSL2 internal IP The firewall rule allows LAN devices to access Windows port 5000 Step 4: Test the Forwarding On the Windows machine: curl http://localhost:5000 # or curl http://192.168.1.227:5000 From another LAN device: http://\u0026lt;Windows-LAN-IP\u0026gt;:5000 Example:\nhttp://192.168.1.227:5000 Step 5 (Optional): Auto-update Script WSL2 IP can change after reboot. 
You can create a PowerShell script wsl_port_forward.ps1 to update rules:\n# Get current WSL IP $wsl_ip = wsl hostname -I | ForEach-Object { $_.Split(\u0026#34; \u0026#34;)[0] } Write-Host \u0026#34;Detected WSL IP: $wsl_ip\u0026#34; # Remove old rule netsh interface portproxy delete v4tov4 listenport=5000 listenaddress=0.0.0.0 # Add new rule netsh interface portproxy add v4tov4 listenport=5000 listenaddress=0.0.0.0 connectport=5000 connectaddress=$wsl_ip # Allow firewall netsh advfirewall firewall add rule name=\u0026#34;WSL Flask 5000\u0026#34; dir=in action=allow protocol=TCP localport=5000 Run this script before starting WSL each time It detects the current WSL IP and updates the forwarding rule Step 6: Notes Flask must listen on 0.0.0.0, otherwise only local access works Ensure Windows Firewall allows TCP port 5000 If LAN devices still cannot access: Check router policies for blocked LAN ports Verify Windows firewall rules WSL2 uses NAT; LAN devices cannot reach the WSL IP directly. 
Use Windows IP + port forwarding Summary WSL2 networking is isolated by default; LAN cannot access WSL directly Windows port forwarding + firewall rules enable LAN access to WSL services An auto-update script can handle WSL IP changes after reboot ","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/wsl-intranet-not-shared-with-windows/","summary":"\u003ch1 id=\"windows--wsl2-port-forwarding-guide-access-flask-5000\"\u003eWindows + WSL2 Port Forwarding Guide (Access Flask 5000)\u003c/h1\u003e\n\u003ch2 id=\"prerequisites\"\u003ePrerequisites\u003c/h2\u003e\n\u003col\u003e\n\u003cli\u003eYou are using \u003cstrong\u003eWSL2\u003c/strong\u003e (Ubuntu or another Linux distro)\u003c/li\u003e\n\u003cli\u003eThe Windows host can access the LAN (Wi-Fi or Ethernet)\u003c/li\u003e\n\u003cli\u003eA Flask service is running inside WSL2 and listening on:\u003c/li\u003e\n\u003c/ol\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-python\" data-lang=\"python\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003eapp\u003cspan style=\"color:#f92672\"\u003e.\u003c/span\u003erun(host\u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e\u003cspan style=\"color:#e6db74\"\u003e\u0026#34;0.0.0.0\u0026#34;\u003c/span\u003e, port\u003cspan style=\"color:#f92672\"\u003e=\u003c/span\u003e\u003cspan style=\"color:#ae81ff\"\u003e5000\u003c/span\u003e)\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cblockquote\u003e\n\u003cp\u003e\u003ccode\u003ehost=\u0026quot;0.0.0.0\u0026quot;\u003c/code\u003e is required; otherwise external access will fail.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003chr\u003e\n\u003ch2 id=\"step-1-check-the-wsl2-ip\"\u003eStep 1: Check the WSL2 IP\u003c/h2\u003e\n\u003cp\u003eIn WSL2:\u003c/p\u003e\n\u003cdiv 
class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-bash\" data-lang=\"bash\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003eip addr show eth0\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eYou should see something like:\u003c/p\u003e\n\u003cpre tabindex=\"0\"\u003e\u003ccode\u003einet 172.26.209.37/20\n\u003c/code\u003e\u003c/pre\u003e\u003cblockquote\u003e\n\u003cp\u003eRecord the IP after \u003ccode\u003einet\u003c/code\u003e (here: \u003ccode\u003e172.26.209.37\u003c/code\u003e). This is the WSL2 internal IP.\u003c/p\u003e","title":"Expose WSL2 Services to the LAN via Windows Port Forwarding"},{"content":"Load Testing APIs with wrk (Detailed Guide) This article explains how to use wrk on Ubuntu to stress-test backend APIs (Flask, FastAPI, Spring Boot, etc.) and interpret the results.\n1. What is wrk? wrk is a modern, high-performance HTTP benchmarking tool written in C. Key features:\nHigh concurrency: thousands of concurrent connections Multi-threaded: uses multiple CPU cores Lua scripting: for custom headers, bodies, tokens Faster than Apache Benchmark (ab): lighter and more stable 2. Install wrk On Ubuntu/Debian:\nsudo apt update sudo apt install wrk -y Verify:\nwrk --version Expected output:\nwrk 4.2.0 [epoll] 3. Quick start Suppose your service is at:\nhttp://192.168.1.224:5000/api/tenders Run:\nwrk -t4 -c100 -d30s http://192.168.1.224:5000/api/tenders Parameters Flag Meaning -t4 4 threads (use multi-core CPU) -c100 100 concurrent connections -d30s 30 seconds duration last arg target URL 4. 
Sample output explained Example output:\nRunning 30s test @ http://192.168.1.224:5000/api/tenders 4 threads and 100 connections Thread Stats Avg Stdev Max +/- Stdev Latency 1.12s 248.83ms 1.99s 85.59% Req/Sec 22.88 14.29 90.00 77.73% 2452 requests in 30.09s, 27.02MB read Socket errors: connect 0, read 0, write 0, timeout 2 Requests/sec: 81.49 Transfer/sec: 0.90MB Metrics Metric Meaning Example Notes Latency Average response time 1.12s Slow if \u0026gt; 1s Req/Sec Requests per thread per second 22.88 Depends on thread count Requests/sec Total QPS 81.49 Throughput Transfer/sec Data per second 0.90MB Bandwidth usage Timeouts Timed-out requests 2 Indicates delays In general:\nExcellent: latency \u0026lt; 200ms OK: 200-800ms Slow: \u0026gt; 1s 5. Tips to improve concurrency 1) Use a production server (Flask example) Do not use app.run(). Use Gunicorn:\npip install gunicorn gunicorn -w 4 -b 0.0.0.0:5000 run:app -w 4: 4 worker processes (recommended 2 * CPU + 1) Improves concurrency and stability 2) Increase async throughput (I/O-bound APIs) gunicorn -w 4 -k gevent -b 0.0.0.0:5000 run:app -k gevent uses async workers to handle many waiting requests.\n3) Reduce response size Large responses consume network bandwidth. Suggestions:\nReturn only required fields Enable gzip (Nginx or Flask plugins) 6. 
Advanced: Lua scripts for custom requests Lua scripts can do:\nCustom headers and tokens POST JSON bodies Randomized parameters Example post.lua:\nwrk.method = \u0026#34;POST\u0026#34; wrk.body = \u0026#39;{\u0026#34;keyword\u0026#34;:\u0026#34;test\u0026#34;}\u0026#39; wrk.headers[\u0026#34;Content-Type\u0026#34;] = \u0026#34;application/json\u0026#34; Run:\nwrk -t4 -c100 -d30s -s post.lua http://127.0.0.1:5000/api/search ","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/wrk-load-testing-guide/","summary":"\u003ch1 id=\"load-testing-apis-with-wrk-detailed-guide\"\u003eLoad Testing APIs with wrk (Detailed Guide)\u003c/h1\u003e\n\u003cblockquote\u003e\n\u003cp\u003eThis article explains how to use \u003ccode\u003ewrk\u003c/code\u003e on Ubuntu to stress-test backend APIs (Flask, FastAPI, Spring Boot, etc.) and interpret the results.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003chr\u003e\n\u003ch2 id=\"1-what-is-wrk\"\u003e1. What is wrk?\u003c/h2\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/wg/wrk\"\u003e\u003ccode\u003ewrk\u003c/code\u003e\u003c/a\u003e is a modern, high-performance HTTP benchmarking tool written in C. Key features:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eHigh concurrency\u003c/strong\u003e: thousands of concurrent connections\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMulti-threaded\u003c/strong\u003e: uses multiple CPU cores\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLua scripting\u003c/strong\u003e: for custom headers, bodies, tokens\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eFaster than Apache Benchmark (ab)\u003c/strong\u003e: lighter and more stable\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"2-install-wrk\"\u003e2. 
Install wrk\u003c/h2\u003e\n\u003cp\u003eOn Ubuntu/Debian:\u003c/p\u003e","title":"How to Use wrk for Load Testing"},{"content":"Access a Git Bare Repo on Windows WSL2 from the LAN In development, you often need to share Git repositories across multiple machines. If you use WSL2 on Windows and want other LAN machines to access a Git bare repo inside WSL2, this guide walks you through the setup.\n1. Create a Git bare repo in WSL2 In WSL2, go to the target directory:\ngit init --bare my_project.git my_project.git is a bare repo with no working tree, only Git data. A bare repo behaves like a remote and can be cloned and pushed. 2. Enable SSH in WSL2 Other machines will access via SSH.\nInstall SSH server: sudo apt update sudo apt install openssh-server -y Start SSH: sudo service ssh start Check status: sudo service ssh status Default port is 22; you can change it in /etc/ssh/sshd_config. 3. Get the WSL2 IP In WSL2:\nip addr Find the inet under eth0, for example:\ninet 172.25.190.21/20 Note: WSL2 IP can change after reboot.\n4. Configure Windows Firewall Allow SSH port through firewall:\nWindows Firewall -\u0026gt; Advanced settings -\u0026gt; Inbound rules -\u0026gt; New rule Rule type: Port -\u0026gt; TCP -\u0026gt; Port 22 (or custom like 2222) Allow connection -\u0026gt; Apply to Domain/Private/Public Name the rule and finish 5. Recommended: Windows port forwarding Because WSL2 IP changes, use Windows port forwarding:\nOpen PowerShell (Admin): netsh interface portproxy add v4tov4 listenport=2222 listenaddress=0.0.0.0 connectport=22 connectaddress=\u0026lt;WSL_IP\u0026gt; From another LAN machine, access via Windows IP + 2222: git clone ssh://user@WINDOWS_IP:2222/home/user/my_project.git user is your WSL2 username WINDOWS_IP is the Windows host LAN IP 6. Clone, push, pull from another machine Clone:\ngit clone ssh://user@WINDOWS_IP:2222/home/user/my_project.git Commit and push:\ngit add . 
git commit -m \u0026#34;update\u0026#34; git push origin main # or master Pull updates:\ngit pull origin main 7. Summary WSL2 has a virtual network; its IP may change on each boot. Port forwarding + firewall rules are the most reliable solution. A bare repo inside WSL2 works like a remote for LAN access. With these steps, multiple LAN machines can access a WSL2 Git repo for easy collaboration.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/notes/git-notes/lan-git-bare-repo/","summary":"\u003ch1 id=\"access-a-git-bare-repo-on-windows-wsl2-from-the-lan\"\u003eAccess a Git Bare Repo on Windows WSL2 from the LAN\u003c/h1\u003e\n\u003cp\u003eIn development, you often need to share Git repositories across multiple machines. If you use WSL2 on Windows and want other LAN machines to access a Git bare repo inside WSL2, this guide walks you through the setup.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"1-create-a-git-bare-repo-in-wsl2\"\u003e1. Create a Git bare repo in WSL2\u003c/h2\u003e\n\u003cp\u003eIn WSL2, go to the target directory:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-bash\" data-lang=\"bash\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003egit init --bare my_project.git\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cul\u003e\n\u003cli\u003e\u003ccode\u003emy_project.git\u003c/code\u003e is a bare repo with no working tree, only Git data.\u003c/li\u003e\n\u003cli\u003eA bare repo behaves like a remote and can be cloned and pushed.\u003c/li\u003e\n\u003c/ul\u003e\n\u003chr\u003e\n\u003ch2 id=\"2-enable-ssh-in-wsl2\"\u003e2. 
Enable SSH in WSL2\u003c/h2\u003e\n\u003cp\u003eOther machines will access via SSH.\u003c/p\u003e","title":"LAN Git Bare on WSL2"},{"content":"Simplified Git Branch Workflow (Solo / Small Team) This workflow is a simplified version of Git Flow. It is suitable for personal projects or small teams: structured but not heavy.\n1. Main branch (long-lived) main Always stable and release-ready Production deployments come from here For small teams, main is usually enough; no need for develop.\n2. Feature development (feature branch) Naming: feature/\u0026lt;feature-name\u0026gt; Purpose: build new features, then merge back to main Examples:\nfeature/login-api feature/user-profile Flow:\n# create feature branch from main git checkout -b feature/login-api main # merge back to main when done git checkout main git merge feature/login-api git branch -d feature/login-api 3. Bug fixes (bugfix branch) Naming: bugfix/\u0026lt;issue-name\u0026gt; Purpose: fix bugs in dev/test Example:\nbugfix/fix-login-redirect Same flow as feature branch, merge back to main.\n4. Hotfixes (hotfix branch) Naming: hotfix/\u0026lt;issue-name\u0026gt; Purpose: urgent production fixes Example:\nhotfix/security-patch Flow:\ngit checkout -b hotfix/security-patch main # fix, commit git checkout main git merge hotfix/security-patch git branch -d hotfix/security-patch 5. Releases (tags) Use Git tags for release versions No dedicated release branch required Example:\ngit tag v1.0.0 git push origin v1.0.0 Minimal recommended rules Permanent branch: main Temporary branches: feature/..., bugfix/..., hotfix/... 
Use tags for releases, no separate release branch Branch lifecycle diagram gitGraph commit id: \u0026#34;Init main\u0026#34; branch feature/login-api commit id: \u0026#34;Build login API\u0026#34; checkout main merge feature/login-api id: \u0026#34;Merge feature\u0026#34; branch bugfix/fix-redirect commit id: \u0026#34;Fix login redirect\u0026#34; checkout main merge bugfix/fix-redirect id: \u0026#34;Merge bugfix\u0026#34; branch hotfix/security-patch commit id: \u0026#34;Emergency security patch\u0026#34; checkout main merge hotfix/security-patch id: \u0026#34;Merge hotfix\u0026#34; commit id: \u0026#34;Tag v1.0.0\u0026#34; ","permalink":"https://shio-chan-dev.github.io/jeanblog/notes/git-notes/git-branching-workflow/","summary":"\u003ch1 id=\"simplified-git-branch-workflow-solo--small-team\"\u003eSimplified Git Branch Workflow (Solo / Small Team)\u003c/h1\u003e\n\u003cp\u003eThis workflow is a simplified version of Git Flow. It is suitable for personal projects or small teams: structured but not heavy.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"1-main-branch-long-lived\"\u003e1. Main branch (long-lived)\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003e\u003ccode\u003emain\u003c/code\u003e\u003c/strong\u003e\n\u003cul\u003e\n\u003cli\u003eAlways stable and release-ready\u003c/li\u003e\n\u003cli\u003eProduction deployments come from here\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/li\u003e\n\u003c/ul\u003e\n\u003cblockquote\u003e\n\u003cp\u003eFor small teams, \u003ccode\u003emain\u003c/code\u003e is usually enough; no need for \u003ccode\u003edevelop\u003c/code\u003e.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003chr\u003e\n\u003ch2 id=\"2-feature-development-feature-branch\"\u003e2. 
Feature development (feature branch)\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eNaming: \u003ccode\u003efeature/\u0026lt;feature-name\u0026gt;\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003ePurpose: build new features, then merge back to \u003ccode\u003emain\u003c/code\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eExamples:\u003c/p\u003e","title":"Git Branch Workflow for Small Teams"},{"content":"Use a Local Git Bare Repo to Separate Dev and Test Environments In full-stack work, a common problem is how to isolate dev and test environments. Many people host on GitHub or GitLab, but private projects may not be suitable for public hosting.\nGit is distributed. You can set up a local bare repo as a remote to move code from dev -\u0026gt; test in one machine.\nWhat is a bare repository? A normal repo (git init) has a working tree + .git metadata and can be edited directly. A bare repo (git init --bare) has only Git data, no working tree. It is usually used as a remote. In short:\nDev repo: where you write code Bare repo: remote sync point with full history Test repo: clone from bare repo to simulate deployment Step 1: Create the bare repo Create a bare repo under a local directory (e.g., ~/.repos):\nmkdir -p ~/.repos cd ~/.repos git init --bare scrapy.git Now ~/.repos/scrapy.git is your local remote.\nStep 2: Add the local remote in your dev repo Assume your dev repo is ~/scrapy:\ncd ~/scrapy git remote add local ~/.repos/scrapy.git Check:\ngit remote -v Expected:\nlocal /home/gong/.repos/scrapy.git (fetch) local /home/gong/.repos/scrapy.git (push) Step 3: Push to the local remote Push main:\ngit push local main Your bare repo now contains all commits.\nStep 4: Clone in the test environment Assume your test environment is ~/test-env:\ncd ~/test-env git clone ~/.repos/scrapy.git You now have a clean copy for testing without affecting dev.\nNote on HEAD warning Sometimes you see:\nwarning: remote HEAD refers to nonexistent ref, unable to checkout This happens 
because a newly created bare repo has no default HEAD. Set it:\ncd ~/.repos/scrapy.git git symbolic-ref HEAD refs/heads/main Then clone again.\nStep 5: Sync workflow In dev (~/scrapy):\ngit add . git commit -m \u0026#34;feat: finish feature\u0026#34; git push local main In test (~/test-env/scrapy):\ngit pull Now you can sync dev -\u0026gt; test easily on one machine.\nSummary If you cannot push to GitHub/GitLab, a local bare repo can separate dev and test:\nno external platform required dev and test isolated full Git history preserved If the project grows, consider a private Git service (Gitea/GitLab CE) or Docker deployment.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/notes/git-notes/git-bare-repo-dev-test-isolation/","summary":"\u003ch1 id=\"use-a-local-git-bare-repo-to-separate-dev-and-test-environments\"\u003eUse a Local Git Bare Repo to Separate Dev and Test Environments\u003c/h1\u003e\n\u003cp\u003eIn full-stack work, a common problem is \u003cstrong\u003ehow to isolate dev and test environments\u003c/strong\u003e. Many people host on GitHub or GitLab, but private projects may not be suitable for public hosting.\u003c/p\u003e\n\u003cp\u003eGit is distributed. You can set up a \u003cstrong\u003elocal bare repo\u003c/strong\u003e as a remote to move code from \u003cstrong\u003edev -\u0026gt; test\u003c/strong\u003e in one machine.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"what-is-a-bare-repository\"\u003eWhat is a bare repository?\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eA normal repo (\u003ccode\u003egit init\u003c/code\u003e) has a \u003cstrong\u003eworking tree + .git metadata\u003c/strong\u003e and can be edited directly.\u003c/li\u003e\n\u003cli\u003eA bare repo (\u003ccode\u003egit init --bare\u003c/code\u003e) has only Git data, no working tree. 
It is usually used as a \u003cstrong\u003eremote\u003c/strong\u003e.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eIn short:\u003c/p\u003e","title":"Use a Local Git Bare Repo to Separate Dev and Test Environments"},{"content":"Introduction For TypeScript files with the .ts extension, we cannot run them directly. We need to transpile TypeScript to JavaScript and then run the JavaScript output.\nThere are two common approaches: upload .ts to the server and compile via CI, or transpile locally and upload the .js build to production. If you want to run and test locally during development, you can use ts-node, but the project still needs a build step for production.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/dev/frontend/typescript-setup-guide/","summary":"\u003ch1 id=\"introduction\"\u003eIntroduction\u003c/h1\u003e\n\u003cp\u003eFor TypeScript files with the \u003ccode\u003e.ts\u003c/code\u003e extension, we cannot run them directly. We need to transpile TypeScript to JavaScript and then run the JavaScript output.\u003c/p\u003e\n\u003cp\u003eThere are two common approaches: upload \u003ccode\u003e.ts\u003c/code\u003e to the server and compile via CI, or transpile locally and upload the \u003ccode\u003e.js\u003c/code\u003e build to production. 
If you want to run and test locally during development, you can use \u003ccode\u003ets-node\u003c/code\u003e, but the project still needs a build step for production.\u003c/p\u003e","title":"How to Use and Configure a TypeScript Environment"},{"content":"Introduction I want to build an AI system that supports tree-shaped or graph-shaped Q\u0026amp;A, instead of a traditional single-thread chat flow.\nExploration Open-source framework research flowise ","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/ai-assistant-frontend-rebuild-ideas/","summary":"\u003ch1 id=\"introduction\"\u003eIntroduction\u003c/h1\u003e\n\u003cp\u003eI want to build an AI system that supports tree-shaped or graph-shaped Q\u0026amp;A, instead of a traditional single-thread chat flow.\u003c/p\u003e\n\u003ch1 id=\"exploration\"\u003eExploration\u003c/h1\u003e\n\u003ch2 id=\"open-source-framework-research\"\u003eOpen-source framework research\u003c/h2\u003e\n\u003ch3 id=\"flowise\"\u003eflowise\u003c/h3\u003e","title":"A New Frontend Idea for AI Assistants"},{"content":"Introduction Mermaid is a framework for creating diagrams using code. This post shows how to install the tooling on your server and render Mermaid code into images.\nSteps Install the renderer Run:\nnpm install -g @mermaid-js/mermaid-cli Note: the CLI requires Node.js version \u0026gt;= 20. 
It is recommended to manage Node.js versions with nvm.\nIf you do not have nvm, install it with:\ncurl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.4/install.sh | bash Restart your shell, then run:\nnvm install 20 nvm use 20 nvm alias default 20 Verify:\nnode -v npm -v Render a diagram Put your Mermaid code in a file ending with .mmd, then run:\nmmdc -i diagrams/example.mmd -o images/example.svg ","permalink":"https://shio-chan-dev.github.io/jeanblog/linux/linux/create-and-edit-mermaid-diagrams/","summary":"\u003ch1 id=\"introduction\"\u003eIntroduction\u003c/h1\u003e\n\u003cp\u003eMermaid is a framework for creating diagrams using code. This post shows how to install the tooling on your server and render Mermaid code into images.\u003c/p\u003e\n\u003ch1 id=\"steps\"\u003eSteps\u003c/h1\u003e\n\u003ch2 id=\"install-the-renderer\"\u003eInstall the renderer\u003c/h2\u003e\n\u003cp\u003eRun:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-bash\" data-lang=\"bash\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003enpm install -g @mermaid-js/mermaid-cli\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eNote: the CLI requires Node.js version \u0026gt;= 20. 
It is recommended to manage Node.js versions with nvm.\u003c/p\u003e\n\u003cp\u003eIf you do not have nvm, install it with:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;\"\u003e\u003ccode class=\"language-bash\" data-lang=\"bash\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003ecurl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.4/install.sh | bash\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eRestart your shell, then run:\u003c/p\u003e","title":"How to Create and Edit Mermaid Diagrams"},{"content":"How to Truly Master a Paper Conclusion To truly master a paper, reading once is not enough. You need to decompose, verify, and reconstruct it, and then express the key points in your own words or implementation. The goal: explain the core contribution in 5 minutes, derive key formulas by hand, and reproduce a core experiment.\nPrinciples and background A paper is a compressed expression of a problem. It omits background, intuition, failed attempts, and many details. Mastery requires \u0026ldquo;decompressing\u0026rdquo; that information into your own knowledge network: assumptions, derivations, engineering steps, and the limits of the results. Only then can you judge when to use it and when not to.\nSteps Do not treat a paper as authority. Treat it as a claim you can test. Break the claims into verifiable assertions and test them. Mastery is not memorizing text, but turning it into a tool you can use. Real understanding requires action: derive, implement, compare, explain.\nPreparation and pre-read (30-60 minutes)\nRead title, abstract, conclusion, figures (skip details). Capture what problem it solves and what results it claims. Scan intro and contributions; list 3 key claims. Check references to see if you need to read prerequisites. 
Deep read (2-6 hours)\nRead methods/theory carefully. Hand-derive key formulas. Create a symbol table; write pseudocode for algorithms. Mark unclear points and create a question list. Decompose and reconstruct (half day to days)\nBreak the paper into: problem, assumptions, method, theorems, experiments, conclusions, limits. Write 2-3 sentences for each section in your own words. Implement a minimal runnable version of the algorithm. Implement and reproduce (hours to days)\nFocus on the part that best reflects the contribution. Debug on small synthetic data, then match paper settings. Suggested environments: Python + Jupyter/Colab, or C++/Rust for systems/perf. Common libraries: numpy/pandas/matplotlib/scikit-learn/torch/tensorflow. Map paper symbols to code variables in comments/docstrings. Plot and compare\nReproduce key plots (loss curves, error tables). If exact numbers are hard, verify trends. Add assertions/unit tests to confirm theory on synthetic cases. Digest and output\nWrite a one-page cheatsheet or short blog; aim for a 5-minute explanation. Create Anki cards for assumptions, theorem conditions, derivation steps. Explain to someone else or write a report. Tools (practical)\nReferences: Zotero / Mendeley Notes: Obsidian / Notion / org-mode Code and experiments: Git + Jupyter/Colab + Docker Text tools: pdftotext, pdfgrep, grep, ripgrep Common mistakes Mistake: only read, never do (no derivation or implementation). Fix: force yourself to implement or write pseudocode and derive key steps. Mistake: ignore assumptions and boundaries. Fix: list all assumptions and test violations. Mistake: equate code with the paper. Fix: read author code and compare with the paper; record differences. Mistake: chase exact numeric reproduction too early. Fix: verify trends first, then refine details. Mistake: accept formulas without checking steps. Fix: derive line by line and track missing lemmas. 
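The advice above to "verify trends first" and "add assertions/unit tests to confirm theory on synthetic cases" can be sketched in a few lines. This is a generic illustration, not tied to any particular paper: the linear model, noise level, and tolerance are all assumptions chosen for the example.

```python
import numpy as np

# Sanity check in the spirit of "verify trends on synthetic data":
# if a method claims to recover the true parameters, confirm it on
# data where the ground truth is known before chasing exact numbers.
rng = np.random.default_rng(0)

true_w = np.array([2.0, -1.0, 0.5])           # known ground truth
X = rng.normal(size=(500, 3))                 # synthetic inputs
y = X @ true_w + 0.01 * rng.normal(size=500)  # targets with small noise

# Ordinary least squares stands in for the "implementation under test"
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Assertion-style check: estimates should be close to the ground truth
assert np.allclose(w_hat, true_w, atol=0.05), (w_hat, true_w)
print("synthetic check passed:", np.round(w_hat, 3))
```

The same pattern scales up: swap in the paper's algorithm for `lstsq`, keep the synthetic generator and the assertion, and tighten tolerances only once the trend holds.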
Verification checklist Explain the core contribution, use cases, and limits in 5 minutes. Derive key formulas or rewrite the proof by hand. Implement a minimal working example that matches paper trends. Answer: what assumptions are critical, and what failure modes exist? Apply the idea to a slightly different problem and observe results. ","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/mastering-paper/","summary":"\u003ch1 id=\"how-to-truly-master-a-paper\"\u003eHow to Truly Master a Paper\u003c/h1\u003e\n\u003ch1 id=\"conclusion\"\u003eConclusion\u003c/h1\u003e\n\u003cp\u003eTo truly master a paper, reading once is not enough. You need to decompose, verify, and reconstruct it, and then express the key points in your own words or implementation. The goal: explain the core contribution in 5 minutes, derive key formulas by hand, and reproduce a core experiment.\u003c/p\u003e\n\u003ch1 id=\"principles-and-background\"\u003ePrinciples and background\u003c/h1\u003e\n\u003cp\u003eA paper is a compressed expression of a problem. It omits background, intuition, failed attempts, and many details. Mastery requires \u0026ldquo;decompressing\u0026rdquo; that information into your own knowledge network: assumptions, derivations, engineering steps, and the limits of the results. Only then can you judge when to use it and when not to.\u003c/p\u003e","title":"Mastering a Paper"},{"content":"What problem does this paper solve, and what are the results? We know AI systems are expanding and can solve general tasks. But many AI agent applications today target small tasks. 
NVIDIA argues that small language models (SLMs) are capable, more suitable, and cheaper, and should be a main direction for future agents.\nThe paper discusses:\nWhat tasks current SLMs can handle Where general language ability matters The limits of SLMs as agents Conclusion: moving from LLMs to SLMs has advantages in both capability and cost.\n","permalink":"https://shio-chan-dev.github.io/jeanblog/thoughts/thoughts/reading-nvidia-small-models-paper/","summary":"\u003ch1 id=\"what-problem-does-this-paper-solve-and-what-are-the-results\"\u003eWhat problem does this paper solve, and what are the results?\u003c/h1\u003e\n\u003cp\u003eWe know AI systems are expanding and can solve general tasks. But many AI agent applications today target small tasks. NVIDIA argues that small language models (SLMs) are capable, more suitable, and cheaper, and should be a main direction for future agents.\u003c/p\u003e\n\u003cp\u003eThe paper discusses:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eWhat tasks current SLMs can handle\u003c/li\u003e\n\u003cli\u003eWhere general language ability matters\u003c/li\u003e\n\u003cli\u003eThe limits of SLMs as agents\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eConclusion: moving from LLMs to SLMs has advantages in both capability and cost.\u003c/p\u003e","title":"Reading an NVIDIA Paper on Small Language Models"}]