Rabin-Karp String Matching in DSA Python - Time & Space Complexity
We want to understand how the time needed to find a pattern in a text grows as the text and pattern get bigger.
How does the Rabin-Karp method handle this search efficiently?
Analyze the time complexity of the following code snippet.
```python
def rabin_karp(text, pattern):
    n, m = len(text), len(pattern)
    p_hash = hash(pattern)                  # hash the pattern once
    for i in range(n - m + 1):              # slide a window of length m over the text
        t_hash = hash(text[i:i+m])          # rehash the current window (O(m) per window)
        if t_hash == p_hash and text[i:i+m] == pattern:  # verify on hash match
            return i                        # index of the first match
    return -1                               # pattern not found
```
This code tries to find the first place where the pattern appears in the text using hashing to speed up comparisons.
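A quick usage sketch (the function is repeated here so the example runs standalone):

```python
def rabin_karp(text, pattern):
    n, m = len(text), len(pattern)
    p_hash = hash(pattern)
    for i in range(n - m + 1):
        t_hash = hash(text[i:i+m])
        if t_hash == p_hash and text[i:i+m] == pattern:
            return i
    return -1

print(rabin_karp("hello world", "world"))  # first match at index 6
print(rabin_karp("hello world", "xyz"))    # no match: -1
```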
Identify the operations that repeat: loops, recursion, or array traversals.
- Primary operation: Looping through the text to check each substring of pattern length.
- How many times: Approximately n - m + 1 times, where n is text length and m is pattern length.
- Secondary operation: Computing a hash for each substring (O(m) per window, since the slice is rehashed from scratch) and comparing it to the pattern's hash.
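To confirm the n - m + 1 count, a small sketch that tallies window checks (the counter is illustrative, not part of the original function):

```python
def count_window_checks(text, pattern):
    """Count how many windows the Rabin-Karp loop examines."""
    n, m = len(text), len(pattern)
    checks = 0
    for i in range(n - m + 1):
        checks += 1  # one hash computation + comparison per window
    return checks

print(count_window_checks("a" * 100, "abc"))  # 100 - 3 + 1 = 98 windows
```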
As the text grows, the number of substring checks grows roughly in a straight line with the text size.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks |
| 100 | About 100 checks |
| 1000 | About 1000 checks |
Pattern observation: The number of window checks grows linearly with the text size, but each check costs O(m) time for slicing and hashing.

Time Complexity: O(n * m) on average and in the worst case for this implementation

This means the search grows roughly with the product of the text and pattern sizes. The average case is O(n * m) here because every window's hash is recomputed from scratch over m characters; a rolling hash would bring the average down, but hash collisions can still force the O(n * m) worst case.
[X] Wrong: "Rabin Karp always runs in linear time no matter what."
[OK] Correct: Sometimes many hash collisions happen, causing extra character-by-character checks that slow the search down to O(n * m) time.
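The collision risk is easier to see with a deliberately weak hash. The sketch below swaps Python's built-in `hash` for a sum of character codes (a hypothetical choice for illustration), so every anagram of the pattern collides and forces a full comparison:

```python
def weak_hash(s):
    # Deliberately weak: all anagrams of a string get the same hash.
    return sum(map(ord, s))

def rabin_karp_weak(text, pattern):
    n, m = len(text), len(pattern)
    p_hash = weak_hash(pattern)
    collisions = 0
    for i in range(n - m + 1):
        window = text[i:i+m]
        if weak_hash(window) == p_hash:
            if window == pattern:
                return i, collisions
            collisions += 1  # hash matched but the strings differ
    return -1, collisions

# Every window of "cbacbacba" is an anagram of "abc", so every
# window collides and triggers a full comparison, yet none matches.
print(rabin_karp_weak("cbacbacba", "abc"))  # (-1, 7)
```

This is exactly the degenerate case behind the O(n * m) worst-case bound: each of the n - m + 1 windows costs a full O(m) verification.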
Understanding Rabin-Karp helps you see how hashing can speed up searching and when it might slow down, a useful skill for many coding problems.
"What if we used a rolling hash to avoid recomputing the hash from scratch each time? How would the time complexity change?"
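One way to answer that question is a polynomial rolling hash: each new window's hash is derived from the previous one in O(1), so the average case drops to O(n + m), while collision verification keeps the worst case at O(n * m). A sketch (the base and modulus values are common but arbitrary choices):

```python
def rabin_karp_rolling(text, pattern):
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return -1 if m > n else 0
    base, mod = 256, 1_000_000_007   # illustrative constants
    high = pow(base, m - 1, mod)     # weight of the leftmost character
    p_hash = t_hash = 0
    for j in range(m):               # O(m) setup for both initial hashes
        p_hash = (p_hash * base + ord(pattern[j])) % mod
        t_hash = (t_hash * base + ord(text[j])) % mod
    for i in range(n - m + 1):
        if t_hash == p_hash and text[i:i+m] == pattern:
            return i                 # verified match
        if i < n - m:                # roll: drop text[i], append text[i+m]
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return -1

print(rabin_karp_rolling("hello world", "world"))  # 6
```

The roll step replaces the O(m) rehash with three arithmetic operations, which is what turns the O(n * m) average of the naive version into O(n + m).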