Problem Requirement
Input / Output
- Input:
nums, with1 <= nums.length <= 10 - Value range:
-10 <= nums[i] <= 10 numsmay contain duplicates- Output: return all unique subsets
- Ordering: the result order does not matter, but the same value sequence may appear only once
Example
Input: nums = [1,2,2]
Output: [[],[1],[1,2],[1,2,2],[2],[2,2]]
This tutorial starts with a correct but wasteful version, then derives sorting plus layer-level deduplication.
Start From Duplicate Branches in [1,2,2]
The smallest example that exposes the issue is:
nums = [1,2,2]
If we copy the 78. Subsets template directly, two different branches collide:
choose the 2 at index 1 -> [2]
choose the 2 at index 2 -> [2]
The branches use different indices, but the value sequence is the same.
So the new problem is not basic backtracking. The real question is:
How do we skip duplicate branches without deleting valid answers such as
[2,2]?
Step 1: Reuse the 78 State and Watch What Breaks
The partial answer still needs path; each layer still needs start so it only chooses to the right.
Start with the core version from 78. Subsets:
def dfs(start: int) -> None:
res.append(path.copy())
for i in range(start, len(nums)):
path.append(nums[i])
dfs(i + 1)
path.pop()
This version is correct when all values are distinct.
Now this version can:
- Enumerate every index combination.
- Keep
[2,2], because the two2s come from different indices.
It still lacks:
- A way to distinguish a valid repeated value from a duplicate branch at the same layer.
Step 2: First Build a Correct but Wasteful Version With seen
In the previous version, do not optimize yet. To make the answer correct, convert each path into a tuple and store it in seen.
Add seen:
seen: set[tuple[int, ...]] = set()
Replace the collection rule with:
state = tuple(path)
if state not in seen:
seen.add(state)
res.append(path.copy())
The first complete correct version is:
def subsets_with_dup(nums: list[int]) -> list[list[int]]:
res: list[list[int]] = []
path: list[int] = []
seen: set[tuple[int, ...]] = set()
def dfs(start: int) -> None:
state = tuple(path)
if state not in seen:
seen.add(state)
res.append(path.copy())
for i in range(start, len(nums)):
path.append(nums[i])
dfs(i + 1)
path.pop()
dfs(0)
return res
Now this version can:
- Return unique subsets.
- Preserve valid results such as
[2,2].
It still lacks:
- It still generates duplicate branches before throwing duplicates away.
- If input is
[2,1,2], equal values are not adjacent, so branch-level deduplication is hard.
Step 3: Sort First So Equal Values Become Adjacent
The bottleneck in the previous version is that duplicate branches are already built. To skip them before entering recursion, equal values must be next to each other.
In the previous version, add sorting before DFS:
nums.sort()
For [2,1,2], sorting gives:
[1,2,2]
Once equal values are adjacent, if the current layer already tried the first 2, it can recognize the second 2 as an equivalent same-layer branch.
Now this version can:
- Group equal values together.
- Keep correctness through
seen.
It still lacks:
- The actual rule that skips duplicate branches before they are built.
Step 4: Skip Only Duplicate Candidates in the Same Layer
Now replace “generate then deduplicate” with “skip before generating”.
The key condition is:
if i > start and nums[i] == nums[i - 1]:
continue
Why i > start?
i > startmeans this candidate is not the first candidate in the current layer.nums[i] == nums[i - 1]means an equal value has already opened a branch in this layer.
Why not i > 0?
Because deeper layers must still be allowed to choose the second 2 to form [2,2].
For [1,2,2]:
First layer:
i = 1 chooses the first 2 -> valid, produces [2]
i = 2 sees another 2 and i > start -> skip duplicate [2]
Inside the [2] branch:
start = 2
i = 2 chooses the second 2 -> i == start, do not skip, produce [2,2]
Now seen is no longer needed.
The final complete solution is:
class Solution:
def subsetsWithDup(self, nums: list[int]) -> list[list[int]]:
nums.sort()
res: list[list[int]] = []
path: list[int] = []
def dfs(start: int) -> None:
res.append(path.copy())
for i in range(start, len(nums)):
if i > start and nums[i] == nums[i - 1]:
continue
path.append(nums[i])
dfs(i + 1)
path.pop()
dfs(0)
return res
if __name__ == "__main__":
ans = Solution().subsetsWithDup([1, 2, 2])
print(ans)
Now this version can:
- Skip duplicate branches before entering recursion.
- Keep
[2,2]. - Avoid extra
seenstorage for answer deduplication.
It still lacks:
- A guaranteed output order; the problem does not require one.
Slow Branch Trace
After sorting, nums = [1,2,2].
At the first layer dfs(0):
collect []
i = 0, choose 1 -> path = [1]
i = 1, choose the first 2 -> path = [2]
i = 2, second 2 equals previous 2 and i > start, skip
Inside the [2] branch, at dfs(2):
start = 2
i = 2, here i == start, do not skip
choose the second 2 -> path = [2,2]
So the rule is not “duplicate values cannot be used”. The real rule is:
In the same layer, equal values open only one branch; deeper layers may still use later duplicates.
Correctness
Invariant:
- Sorting makes equal values adjacent.
- In the same recursion layer, if an equal value has already opened a branch, a later equal value would generate duplicate value sequences.
i > startrestricts the skip to the current layer only.
Why nothing is missed:
- Each value group still keeps the first candidate branch in every layer.
- If an answer needs multiple equal values such as
[2,2], the later equal value appears in a deeper layer wherei == start, so it is not skipped.
Why nothing is duplicated:
- Duplicate subsets come from same-layer equal-value branches.
- The skip rule removes all but the first such branch.
Complexity
- Sorting costs
O(n log n). - There can still be up to
2^nsubsets. - Copying collected paths costs
O(n * 2^n)overall. - The recursion stack and
pathuseO(n)extra space, excluding output.
Common Mistakes
- Writing
i > 0, which can incorrectly remove[2,2]. - Forgetting to sort, so equal values are not adjacent.
- Thinking duplicate values are forbidden; only duplicate same-layer branches are forbidden.
- Keeping both
seenand layer-level deduplication, which leaves two competing dedup mechanisms.
Summary
90. Subsets IIis78. Subsetsplus duplicate handling.- A
seenversion is correct but wastes search. - Sorting is the precondition for branch-level deduplication.
if i > start and nums[i] == nums[i - 1]means layer-level deduplication, not value-level deletion.
References
- LeetCode 78: Subsets
- LeetCode 90: Subsets II
- LeetCode 40: Combination Sum II
Notes
- Problem facts, examples, and constraints were taken from the existing repository draft for LeetCode 90.
- Python is used as the implementation language for the guided build.