Community Problem

Minimum Genetic Mutation

Difficulty: Medium

Transform a gene string to a target one mutation at a time, where each intermediate must be in the bank.

Minimum Genetic Mutation

Transform a gene string to a target one mutation at a time, where each intermediate must be in the bank.

MEDIUM
Free
graphs
bfs
strings
shortest-path
lucasmoreau

By @lucasmoreau

December 13, 2025

·

Updated May 20, 2026

592 views

19

4.4 (11)

I had this on a Google L4 phone screen, and the candidate flailed because they were trying to model the search as a tree of all 4^8 strings instead of treating the bank as a graph and BFS-ing from start to end. The algorithm collapses the moment you reframe "each mutation" as "each edge in a graph where nodes are gene strings and edges connect those that differ by one character". The catalog covers BFS on grids, but it skipped this string-state BFS twin.

Minimum Genetic Mutation

A gene string can be represented by an 8-character long string, with choices from 'A', 'C', 'G', and 'T'.

Suppose we need to investigate a mutation from a gene string startGene to a gene string endGene where one mutation is defined as one single character changed in the gene string.

For example, "AACCGGTT" --> "AACCGGTA" is one mutation.

There is also a gene bank that records all the valid gene mutations. A gene must be in bank to make it a valid gene string.

Given the two gene strings startGene and endGene and the gene bank, return the minimum number of mutations needed to mutate from startGene to endGene. If there is no such a mutation, return -1.

Note: the starting point is assumed to be valid, so it might not be included in the bank.

Examples

Example 1:

  • Input: startGene = "AACCGGTT", endGene = "AACCGGTA", bank = ["AACCGGTA"]
  • Output: 1
  • Explanation: One mutation transforms the start to the end and the result is in the bank.

Example 2:

  • Input: startGene = "AACCGGTT", endGene = "AAACGGTA", bank = ["AACCGGTA", "AACCGCTA", "AAACGGTA"]
  • Output: 2
  • Explanation: One BFS path: AACCGGTT -> AACCGGTA -> AAACGGTA. Two mutations.

Example 3:

  • Input: startGene = "AAAAACCC", endGene = "AACCCCCC", bank = ["AAAACCCC", "AAACCCCC", "AACCCCCC"]
  • Output: 3
  • Explanation: BFS finds AAAAACCC -> AAAACCCC -> AAACCCCC -> AACCCCCC. Three mutations.

Example 4:

  • Input: startGene = "AACCGGTT", endGene = "AACCGGTA", bank = []
  • Output: -1
  • Explanation: With no entries in the bank, no mutation is valid.

Constraints

  • 0 <= bank.length <= 10.
  • startGene.length == endGene.length == bank[i].length == 8.
  • startGene, endGene, and bank[i] consist of only the characters 'A', 'C', 'G', and 'T'.

Follow-up

The 4 * 8 = 32 possible single-character mutations from any gene make this BFS very tight: each level expands at most 32 candidates, so the algorithm runs in O(L * 32 * |bank|) where L is the number of mutations needed. For a length-8 alphabet-of-4 string, the state space is bounded at 4^8 = 65536, so even the worst case is fast.

Solution

Hints

0/4
Hint 1
Hint 2
Hint 3
Hint 4
All Problems