Algorithms

String Algorithms

Difficulty: Advanced

Naive substring search compares the pattern against every position in the text, restarting from scratch on each mismatch, for O(n m) worst-case time. KMP, by precomputing where to resume after a mismatch, never restarts. The text pointer...

Learn
/
Algorithms
/

String Algorithms

0%
ADVANCED
Complexity varies

String Algorithms

Naive substring search compares the pattern against every position in the text, restarting from scratch on each mismatch, for O(n * m) worst-case time. KMP, by precomputing where to resume after a mismatch, never restarts. The text pointer moves forward strictly monotonically, the pattern pointer is corrected by a failure function, and the whole search runs in O(n + m). That single insight defines a family of linear-time string algorithms that power editors, search engines, and bioinformatics pipelines.

String Algorithms covers that family. KMP teaches you to build the failure function (prefix function) and use it to skip redundant comparisons. The Z algorithm offers an alternative O(n) framework based on the Z-array. Rabin-Karp introduces rolling hashes for multi-pattern search and is the backbone of plagiarism detection. Manacher's algorithm finds the longest palindromic substring in linear time using a clever transform with separators. Aho-Corasick generalizes KMP to multiple patterns simultaneously by adding failure links to a trie, producing a finite automaton that scans the text once.

In Arrays & Strings, you treated strings as arrays of characters with O(1) indexed access. Hash Map (Dictionary) Basics gave you the O(1) lookup used by Rabin-Karp. This lesson layers algorithmic structure on top of those primitives.

Next, Pattern Matching Algorithms extends the toolkit with Boyer-Moore, DFA-based matching, and 2D pattern search.

Advanced
70 min
6 Objectives
5 Sections

Topics:

Algorithms
Strings
String Matching
KMP Algorithm
Rabin-Karp Algorithm
Z-Algorithm
Rolling Hash
Palindrome
Advanced
Premium

What You'll Learn

By the end of this lesson, you will be able to:

Explain why naive string matching is O(nm) and how KMP improves it to O(n+m) using the failure function.

Build the KMP failure (prefix) function and implement the full KMP search algorithm in both JavaScript and Python.

Compute the Z-array and use it for pattern matching by concatenating pattern + separator + text.

Implement Rabin-Karp with a rolling hash function and explain its average-case O(n+m) and worst-case O(nm) behavior.

Describe Manacher's algorithm for longest palindromic substring and Aho-Corasick for multi-pattern matching at a conceptual level.

Compare KMP, Z, and Rabin-Karp to choose the right algorithm for a given problem.

Why This Matters

01

KMP eliminates redundant comparisons by precomputing where to resume after a mismatch, reducing naive O(nm) matching to O(n+m) — a foundational technique in text processing.

02

Rabin-Karp's rolling hash enables efficient multi-pattern search and powers plagiarism detection, DNA sequence matching, and substring search in large corpora.

03

The Z algorithm provides an elegant alternative to KMP with a simpler implementation, useful for pattern matching, string compression, and competitive programming.

04

Manacher's algorithm finds the longest palindromic substring in linear time, a classic hard interview problem that naive approaches solve in O(n^2) or O(n^3).

Key Terms

8 terms you'll encounter in this lesson

1

Failure Function (Prefix Function)

An array where fail[i] is the length of the longest proper prefix of pattern[0..i] that is also a suffix. KMP uses it to skip redundant comparisons after a mismatch.

2

Z-Array

For a string S, Z[i] is the length of the longest substring starting at index i that matches a prefix of S. Used for pattern matching by concatenating pattern$text.

3

Rolling Hash

A hash function that can be updated in O(1) when a window slides by one character, by removing the contribution of the outgoing character and adding the incoming one.

4

Rabin-Karp Algorithm

A string matching algorithm that uses rolling hashes to compare the pattern hash with each window hash in O(1), achieving O(n+m) average time.

5

KMP Algorithm

Knuth-Morris-Pratt algorithm: builds a failure function from the pattern, then scans the text without ever backtracking, achieving O(n+m) worst-case time.

6

Manacher's Algorithm

Finds the longest palindromic substring in O(n) by maintaining a center and right boundary, reusing previously computed palindrome radii to avoid redundant expansion.

7

Aho-Corasick Algorithm

Builds a trie of multiple patterns with failure links (like KMP on a trie), enabling simultaneous matching of all patterns in O(n + m + z) where z is total matches.

8

Spurious Hit

In Rabin-Karp, a hash collision where the hash matches but the actual strings differ. Resolved by a character-by-character comparison, contributing to worst-case O(nm).

Heads Up

Common misconceptions to watch out for

"KMP is always the best string matching algorithm"

KMP has O(n+m) worst-case, but Rabin-Karp is better for multi-pattern search (multiple hashes in parallel), and Z-algorithm is often simpler to implement in contests. The best choice depends on the problem.

"Rabin-Karp is O(n+m) guaranteed"

Rabin-Karp is O(n+m) on average, but hash collisions (spurious hits) can degrade it to O(nm) in the worst case. Using a large prime modulus and double hashing reduces collision probability.

"The failure function value at index i is the longest prefix equal to a suffix of the entire pattern"

The failure function at index i considers only pattern[0..i], not the full pattern. It is the longest proper prefix of pattern[0..i] that also equals a suffix of pattern[0..i].

"You need to learn all five string algorithms to solve problems"

KMP and Rabin-Karp cover most interview and practical use cases. Z-algorithm is a contest favorite. Manacher's and Aho-Corasick are specialized — learn them when you encounter specific problem types.