Bird
0
0
DSA Cprogramming

KMP Pattern Matching Algorithm in DSA C

Choose your learning style9 modes available
Mental Model
KMP finds a pattern in text by remembering where to restart after a mismatch, avoiding repeated checks.
Analogy: Imagine searching a word in a book and using a bookmark to jump back to the right page instead of starting over from the beginning.
Text:  A B C D A B C D A B C D
Pattern:       A B C D
Prefix Table: [0,0,0,0]
Dry Run Walkthrough
Input: text: "ABABDABACDABABCABAB", pattern: "ABABCABAB"
Goal: Find all starting indices where the pattern appears in the text using KMP.
Step 1: Build prefix table for pattern
pattern: A B A B C A B A B
prefix table: [0,0,1,2,0,1,2,3,4]
Why: Prefix table tells where to restart pattern matching after mismatch
Step 2: Start matching pattern with text at index 0
text: [A] B A B D A B A C D A B A B C A B A B
pattern: [A] B A B C A B A B
Why: First characters match, continue checking next
Step 3: Mismatch at text index 4 and pattern index 4
text: A B A B [D] A B A C D A B A B C A B A B
pattern: A B A B [C] A B A B
restart pattern index to prefix_table[3]=2
Why: Mismatch found, use prefix table to avoid rechecking from start
Step 4: Continue matching from pattern index 2 and text index 4
text: A B A B D [A] B A C D A B A B C A B A B
pattern: A B [A] B C A B A B
Why: Restart pattern matching from index 2 to save time
Step 5: Find full pattern match starting at text index 10
text: A B A B D A B A C D [A B A B C A B A B]
pattern: A B A B C A B A B
Why: Pattern fully matched, record index 10
Result:
Match found at indices: 10
Annotated Code
DSA C
#include <stdio.h>
#include <string.h>

void buildPrefixTable(char* pattern, int* prefix, int m) {
    int len = 0; // length of previous longest prefix suffix
    prefix[0] = 0; // first value always 0
    int i = 1;
    while (i < m) {
        if (pattern[i] == pattern[len]) {
            len++;
            prefix[i] = len;
            i++;
        } else {
            if (len != 0) {
                len = prefix[len - 1]; // fallback in prefix
            } else {
                prefix[i] = 0;
                i++;
            }
        }
    }
}

void KMPSearch(char* text, char* pattern) {
    int n = strlen(text);
    int m = strlen(pattern);
    int prefix[m];
    buildPrefixTable(pattern, prefix, m);

    int i = 0; // index for text
    int j = 0; // index for pattern
    printf("Match found at indices: ");
    int found = 0;
    while (i < n) {
        if (pattern[j] == text[i]) {
            i++;
            j++;
        }
        if (j == m) {
            printf("%d ", i - j); // pattern found
            j = prefix[j - 1]; // continue searching
            found = 1;
        } else if (i < n && pattern[j] != text[i]) {
            if (j != 0) {
                j = prefix[j - 1]; // fallback in pattern
            } else {
                i++;
            }
        }
    }
    if (!found) {
        printf("-1");
    }
    printf("\n");
}

int main() {
    char text[] = "ABABDABACDABABCABAB";
    char pattern[] = "ABABCABAB";
    KMPSearch(text, pattern);
    return 0;
}
while (i < m) { if (pattern[i] == pattern[len]) { len++; prefix[i] = len; i++; } else { if (len != 0) { len = prefix[len - 1]; } else { prefix[i] = 0; i++; } } }
build prefix table to know fallback positions on mismatch
while (i < n) { if (pattern[j] == text[i]) { i++; j++; } if (j == m) { printf("%d ", i - j); j = prefix[j - 1]; } else if (i < n && pattern[j] != text[i]) { if (j != 0) { j = prefix[j - 1]; } else { i++; } } }
match pattern in text using prefix table to skip unnecessary checks
OutputSuccess
Match found at indices: 10
Complexity Analysis
Time: O(n + m) because we build prefix table in O(m) and scan text in O(n) without backtracking
Space: O(m) for prefix table storing fallback positions for pattern
vs Alternative: Naive search is O(n*m) because it restarts from next text index on mismatch; KMP avoids this by using prefix table
Edge Cases
Empty pattern
No matches found, prints -1
DSA C
int m = strlen(pattern);
Pattern longer than text
No matches found, prints -1
DSA C
while (i < n) { ... }
Pattern equals text
Match found at index 0
DSA C
if (j == m) { printf("%d ", i - j); j = prefix[j - 1]; }
When to Use This Pattern
When you need to find a pattern inside a text efficiently and avoid repeated comparisons, use KMP because it remembers where to restart after mismatches.
Common Mistakes
Mistake: Not building the prefix table correctly, causing wrong fallback positions.
Fix: Carefully update prefix table using previous longest prefix suffix length and fallback when mismatch occurs.
Mistake: Not updating pattern index j correctly after a full match.
Fix: After a full match, reset j to prefix[j-1] to continue searching for overlapping matches.
Summary
KMP finds all occurrences of a pattern in a text efficiently by using a prefix table to skip unnecessary comparisons.
Use KMP when you want fast pattern searching without restarting from scratch after mismatches.
The key insight is the prefix table that tells where to restart matching in the pattern after a mismatch.