ICPC Competitive Programming

KMP Algorithm

Knuth-Morris-Pratt Algorithm for Pattern Matching

KMP Algorithm

Knuth-Morris-Pratt (KMP) is an algorithm for finding patterns in text with time complexity O(n+m)O(n + m)

Main Concept

KMP avoids redundant comparisons by using a Failure Function (or Prefix Function) that tells how much to shift the pattern when a match fails.

Failure Function (Prefix Function)

For pattern PP of length mm:

  • π[i]\pi[i] = length of the longest proper prefix of P[0..i]P[0..i] that is also a suffix

Example

Pattern: "ABABAC"

iP[0..i]pi[i]Explanation
0A0No proper prefix
1AB0A != B
2ABA1A = A
3ABAB2AB = AB
4ABABA3ABA = ABA
5ABABAC0No match

Implementation

Compute Prefix Function

vector<int> computePrefix(const string& pattern) {
    int m = pattern.size();
    vector<int> pi(m, 0);
    
    int k = 0;  // length of current matching prefix
    
    for (int i = 1; i < m; i++) {
        // If mismatch, fallback to shorter prefix
        while (k > 0 && pattern[k] != pattern[i]) {
            k = pi[k - 1];
        }
        
        // If match, extend prefix
        if (pattern[k] == pattern[i]) {
            k++;
        }
        
        pi[i] = k;
    }
    
    return pi;
}
vector<int> kmpSearch(const string& text, const string& pattern) {
    int n = text.size();
    int m = pattern.size();
    
    vector<int> pi = computePrefix(pattern);
    vector<int> matches;
    
    int j = 0;  // number of characters matched
    
    for (int i = 0; i < n; i++) {
        // If mismatch, use failure function
        while (j > 0 && pattern[j] != text[i]) {
            j = pi[j - 1];
        }
        
        // If match, increment j
        if (pattern[j] == text[i]) {
            j++;
        }
        
        // If entire pattern matched
        if (j == m) {
            matches.push_back(i - m + 1);
            j = pi[j - 1];  // prepare for next match
        }
    }
    
    return matches;
}

Visualization

Text:    A B A B A B A C A B A
Pattern: A B A B A C
         
Step 1: ABABA matches, C != B
        Use pi[4] = 3, shift pattern

Text:    A B A B A B A C A B A
Pattern:     A B A B A C
             
Step 2: ABAB matches, A != B  
        Use pi[3] = 2, shift pattern

Text:    A B A B A B A C A B A
Pattern:         A B A B A C
                 
Step 3: ABABAC matches! Found at index 2

Applications

1. Count Number of Occurrences

int countOccurrences(const string& text, const string& pattern) {
    return kmpSearch(text, pattern).size();
}

2. Find Period of String

Period is the shortest string ss such that pattern = sks^k (repeated kk times)

int findPeriod(const string& s) {
    vector<int> pi = computePrefix(s);
    int n = s.size();
    int period = n - pi[n - 1];
    
    // Verify if it's actually a period
    if (n % period == 0) {
        return period;
    }
    return n;  // No period shorter than itself
}

3. Find Shortest Palindrome (Prepend)

string shortestPalindrome(string s) {
    string rev = s;
    reverse(rev.begin(), rev.end());
    string combined = s + "#" + rev;
    
    vector<int> pi = computePrefix(combined);
    int toAdd = s.size() - pi.back();
    
    return rev.substr(0, toAdd) + s;
}

Complexity

OperationTimeSpace
Build prefix functionO(m)O(m)O(m)O(m)
SearchO(n)O(n)O(1)O(1)
TotalO(n+m)O(n + m)O(m)O(m)

Comparison with Other Algorithms

AlgorithmPreprocessingSearchBest For
NaiveO(1)O(1)O(nm)O(nm)Short patterns
KMPO(m)O(m)O(n)O(n)General purpose
Rabin-KarpO(m)O(m)O(n)O(n) averageMultiple patterns
Z-functionO(m)O(m)O(n)O(n)Similar to KMP

Tips

  1. Failure function is very useful - can solve many other problems
  2. Use sentinel character - when concatenating strings, add separator like #
  3. Watch out for 0-indexed vs 1-indexed - code above uses 0-indexed