Rabin Karp String Matching in DSA C - Time & Space Complexity
We want to understand how the time taken by Rabin Karp string matching changes as the text and pattern sizes grow.
Specifically, how many steps does it take to find a pattern inside a text?
Analyze the time complexity of the following Rabin Karp implementation.
int rabinKarp(char *text, char *pattern) {
int n = strlen(text);
int m = strlen(pattern);
int pHash = 0, tHash = 0, h = 1;
int d = 256, q = 101; // base and prime
for (int i = 0; i < m - 1; i++)
h = (h * d) % q;
for (int i = 0; i < m; i++) {
pHash = (d * pHash + pattern[i]) % q;
tHash = (d * tHash + text[i]) % q;
}
for (int i = 0; i <= n - m; i++) {
if (pHash == tHash) {
int j;
for (j = 0; j < m; j++) {
if (text[i + j] != pattern[j])
break;
}
if (j == m)
return i; // match found
}
if (i < n - m) {
tHash = (d * (tHash - text[i] * h) + text[i + m]) % q;
if (tHash < 0) tHash += q;
}
}
return -1; // no match
}
This code searches for a pattern inside a text using hashing to speed up comparisons.
Look at the loops and repeated steps:
- Primary operation: The main loop slides over the text from start to end, comparing hashes and sometimes characters.
- How many times: The main loop runs about (n - m + 1) times, where n is text length and m is pattern length.
- Inside the main loop, if hashes match, a character-by-character check runs up to m times.
As the text size grows, the number of hash comparisons grows roughly linearly.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 hash checks, few character checks |
| 100 | About 100 hash checks, few character checks |
| 1000 | About 1000 hash checks, few character checks |
Pattern length m affects character checks only when hashes match, which is rare with a good hash.
So, the total steps grow roughly in a straight line as text size grows.
Time Complexity: O(n + m)
This means the time grows mostly in a straight line with the size of the text plus the pattern.
[X] Wrong: "Rabin Karp always checks every character in the text against the pattern."
[OK] Correct: Actually, it uses hash values to skip many character comparisons, so it usually runs faster than checking every character.
Understanding Rabin Karp's time complexity shows you how hashing helps speed up searching, a useful skill for many coding problems.
"What if we used a larger prime number for hashing? How would the time complexity change?"
