Most engineers I have mentored learn five backtracking problems by rote, then freeze on the sixth because they see it as a sixth distinct algorithm rather than the same algorithm with a different predicate. The template that solves N-queens also solves generate-permutations, also solves word-search, also solves sudoku, also solves combination-sum. The visible code looks different in each case because the problem-specific bits sit in different functions, but the spine never changes. Once I started teaching it as one shape with three holes (isComplete, getChoices, isValid), my interview prep stopped feeling like memorising a flashcard deck and started feeling like recognising the same friend in different clothes.
This article is the template, the canonical walk on N-queens, and a short tour of the variations. I am not going to enumerate 30 problems. The title is rhetorical. The point is that a well-formed template plus three or four problem-specific predicates covers a remarkable surface area, and once you internalise the shape you stop being surprised.
Here is the template, in JavaScript, as the first thing on the page so you can refer to it as you read.
That is the entire skeleton. Every backtracking problem I have ever solved is this loop with the four functions filled in differently. The rest of this article shows how.
Why the choose / explore / unchoose shape exists
The shape is dictated by one constraint: we are searching a tree of partial states, but we cannot afford to copy the state at every node because the branching factor would multiply our memory by the depth. So we mutate one shared state, recurse, and then undo the mutation on the way back up. That undo is the unchoose step, and it is the part beginners drop most often.
At each node we add one element, recurse, then remove it. The shared state is a single array that grows and shrinks as we walk the tree. The leaves are complete permutations, and we copy them into the result. If we forgot the unchoose step, the array at the root would be [1, 2, 3] after the first leaf and we would never explore [2, ...].
The unchoose is also why the time complexity feels so honest. We visit each node once, do O(1) work for the choose and unchoose (amortised), and only pay copy cost at the leaves where we record a snapshot. For a problem with n! permutations, the work is O(n * n!) and the space (excluding the result) is O(n) for the recursion depth.
The canonical walk: N-queens
N-queens asks you to place n queens on an n x n board so no two attack each other. I prefer it to permutations as the canonical because it makes the isValid predicate non-trivial and forces you to think about what state to mutate.
The template is intact. isComplete is row === n. getChoices is 0..n-1. isValid is the three-set check. The choose / unchoose pair is symmetric: every set we add to, we delete from; every push, we pop. Symmetry is the most reliable invariant when reading backtracking code. If the choose adds three things and the unchoose removes only two, you have a bug, full stop.
Two design decisions worth pointing out. First, I represent the board as placement[row] = col rather than as a 2D grid, which collapses the state. Second, I track attack lines as sets keyed by the invariants of the diagonals (r - c and r + c), not by recomputing attacks each step. This is the difference between an O(1) validity check and an O(n) one, and on n = 14 it is the difference between sub-second and 30 seconds.
Variation 1: subsets and combinations
Subsets of [1, 2, 3] are [], [1], [2], [3], [1,2], [1,3], [2,3], [1,2,3]. Eight outputs, 2^n in general. Every step of the recursion makes a binary choice for one element: include it or skip it. The template barely changes:
Note isComplete is now "every node" rather than "only leaves", because every partial subset is itself a valid subset. This is the only structural difference from N-queens, and it is one line.
Combinations of size k are the same loop with isComplete set to path.length === k. Combination-sum is the same loop with isComplete set to target === 0 and a sum tracker added. Combination-sum-with-repeats reuses the same start index instead of i + 1. All four are within five lines of each other; the template is the spine.
Variation 2: permutations
Permutations differ from subsets in that order matters, so the choices at each step are "any element we haven't used yet" rather than "any element after the current index".
The used array is the only addition. The choose / unchoose now flips a boolean alongside the path push / pop. Symmetry holds: two adds, two removes.
Variation 3: word search on a grid
Word-search asks if a word can be formed by adjacent cells on a grid (4-connected, no cell reused). The state is now a grid coordinate, and the choices are the four neighbours. The unchoose marks a cell as unvisited again on the way back up.
Notice the trick of marking the cell with # instead of carrying a separate visited grid. The choose step writes #, the unchoose step writes back the original character. One mutation, one inverse, no separate data structure. This is a small optimisation but it ports cleanly to maze-solving, island-counting, and any other grid-traversal-with-no-repeat problem. The pattern is the spine.
When the template stops being enough
For problems with rich constraint propagation (sudoku, certain scheduling problems), pure backtracking exhausts you. Sudoku has 9^81 raw states and even with the template it would not finish. Real sudoku solvers add constraint propagation: when you place a 7 in a row, every other empty cell in that row, column, and 3x3 box has its candidate set updated. The choose step now does both "place the digit" and "propagate constraints". The unchoose step now reverts both. This is still the same template; the choose just got smarter.
For problems where the search space is exponential and the optimal substructure is overlapping, backtracking is the wrong tool entirely; you want dynamic programming. The simplest signal is "am I solving the same subproblem twice along different branches?". If yes, memoise. If memoising changes the asymptotic complexity, you have crossed into DP territory and the template stops being enough.
Backtracking is not for problems with weighted optimisation either. "Find the cheapest path" is a graph algorithm (Dijkstra, A*), not a backtrack. Backtracking is for "enumerate all valid configurations" or "find one valid configuration" or "count valid configurations". The moment you introduce a cost function and ask for the minimum, the right tool changes shape.
Try the template once and the rest is muscle memory
The practical advice I give every junior I have mentored on this pattern is the same: write the N-queens code from scratch, twice, a week apart. Not because N-queens is special, but because the choose / unchoose symmetry is what your hands need to learn. After that, every other problem is the same template with three predicates filled in differently, and you will see the shape immediately when you read the problem statement. The titles change. The signature changes. The set of choices changes. The spine does not. That is the whole reason "backtracking" exists as a single named technique rather than as 30 unrelated algorithms, and once you trust the spine, the surface area collapses.
