0
0
DSA Pythonprogramming

KMP Pattern Matching Algorithm in DSA Python

Choose your learning style9 modes available
Mental Model
KMP finds a pattern in text by remembering where to restart after a mismatch, so it never rechecks characters unnecessarily.
Analogy: Imagine searching a word in a book. Instead of starting over after a mismatch, you use a bookmark that tells you where to continue, saving time.
Text:  A B C D A B C D A B C
Pattern:      A B C D A B
           ↑
Prefix table: [0,0,0,0,1,2]
Dry Run Walkthrough
Input: text: 'ABABDABACDABABCABAB', pattern: 'ABABCABAB'
Goal: Find all starting indices where the pattern appears in the text efficiently.
Step 1: Build prefix table for pattern
Pattern: A B A B C A B A B
Prefix table: [0,0,1,2,0,1,2,3,4]
Why: Prefix table helps know how far to jump back on mismatch.
Step 2: Start matching pattern with text from index 0
Text: A B A B D A B A C D ...
Pattern: A B A B C A B A B
Match indices: 0 to 3 matched
Why: Match characters until mismatch or full pattern matched.
Step 3: Mismatch at text index 4, pattern index 4; use prefix table to shift pattern
Pattern index moves from 4 to prefix_table[3]=2
Text index stays at 4
Why: Avoid rechecking matched characters by jumping pattern index.
Step 4: Continue matching from pattern index 2 and text index 4
Further mismatches occur, matching resumes later from text index 5
Why: Resume matching from shifted pattern position.
Step 5: Full pattern matched at text index 10
Pattern found starting at index 10 in text
Why: We found one occurrence of the pattern.
Step 6: Continue searching for next matches after index 10
j reset to 4, but end of text reached; no more matches
Why: KMP continues to find all occurrences efficiently.
Result:
Pattern found at indices: 10
Annotated Code
DSA Python
class KMPMatcher:
    def __init__(self, pattern: str):
        self.pattern = pattern
        self.prefix_table = self.build_prefix_table(pattern)

    def build_prefix_table(self, pattern: str) -> list[int]:
        prefix = [0] * len(pattern)
        j = 0
        for i in range(1, len(pattern)):
            while j > 0 and pattern[i] != pattern[j]:
                j = prefix[j - 1]  # fallback in prefix table
            if pattern[i] == pattern[j]:
                j += 1
                prefix[i] = j
        return prefix

    def search(self, text: str) -> list[int]:
        result = []
        j = 0  # index for pattern
        for i in range(len(text)):
            while j > 0 and text[i] != self.pattern[j]:
                j = self.prefix_table[j - 1]  # fallback on mismatch
            if text[i] == self.pattern[j]:
                j += 1
            if j == len(self.pattern):
                result.append(i - j + 1)  # match found
                j = self.prefix_table[j - 1]  # continue searching
        return result


if __name__ == '__main__':
    text = 'ABABDABACDABABCABAB'
    pattern = 'ABABCABAB'
    matcher = KMPMatcher(pattern)
    matches = matcher.search(text)
    print('Pattern found at indices:', ', '.join(map(str, matches)))
while j > 0 and pattern[i] != pattern[j]: j = prefix[j - 1] # fallback in prefix table
fallback to previous longest prefix on mismatch while building prefix table
if pattern[i] == pattern[j]: j += 1 prefix[i] = j
extend current prefix length when characters match
while j > 0 and text[i] != self.pattern[j]: j = self.prefix_table[j - 1] # fallback on mismatch
fallback in pattern index on mismatch during search
if text[i] == self.pattern[j]: j += 1
advance pattern index on match
if j == len(self.pattern): result.append(i - j + 1) # match found j = self.prefix_table[j - 1] # continue searching
record match start index and reset pattern index to find next matches
OutputSuccess
Pattern found at indices: 10
Complexity Analysis
Time: O(n + m) because building the prefix table takes O(m) and searching takes O(n), where n is text length and m is pattern length
Space: O(m) for storing the prefix table of the pattern
vs Alternative: Naive search is O(n*m) because it restarts from scratch on mismatch; KMP avoids rechecking by using prefix table, making it faster
Edge Cases
Empty pattern
Returns empty list immediately, no matches
DSA Python
if j == len(self.pattern):
Pattern longer than text
No matches found, returns empty list
DSA Python
for i in range(len(text)):
Pattern equals text
Returns [0] as match at start
DSA Python
if j == len(self.pattern):
When to Use This Pattern
When you need to find a pattern inside a text efficiently and want to avoid rechecking characters after mismatches, use KMP pattern matching because it uses a prefix table to jump smartly.
Common Mistakes
Mistake: Not updating the pattern index j correctly after a mismatch, causing infinite loops or missed matches
Fix: Always fallback j to prefix_table[j-1] on mismatch inside the while loop
Mistake: Building prefix table incorrectly by not handling fallback properly
Fix: Use a while loop to fallback j until characters match or j is zero
Summary
KMP finds all occurrences of a pattern in a text efficiently by using a prefix table to skip unnecessary comparisons.
Use KMP when you want fast pattern searching especially for large texts and repeated patterns.
The key insight is the prefix table that tells how far to jump back in the pattern on mismatch, avoiding rechecking characters.