Lookahead and lookbehind in PHP - Time & Space Complexity
We want to understand how the time it takes to run a PHP regex with lookahead and lookbehind changes as the input grows.
How does the pattern matching cost grow when using these special checks?
Analyze the time complexity of the following code snippet.
$string = "abc123xyz456";
$pattern = '/(?<=\D)\d+(?=\D)/';
preg_match_all($pattern, $string, $matches);
// This finds numbers surrounded by non-digits.
This code uses lookbehind and lookahead to find numbers that have non-digit characters before and after them.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: The regex engine scans the string character by character to check for matches.
- How many times: It checks each position in the string once, applying lookbehind and lookahead at each step.
As the string gets longer, the regex engine does more checks, roughly one per character.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks |
| 100 | About 100 checks |
| 1000 | About 1000 checks |
Pattern observation: The work grows roughly in direct proportion to the input size.
Time Complexity: O(n)
This means the time to find matches grows linearly as the string gets longer.
[X] Wrong: "Lookahead and lookbehind make regex run much slower, like exponentially slower."
[OK] Correct: While these checks add some work, they only look at fixed positions around each character, so the overall time still grows linearly with input size.
Understanding how regex features like lookahead and lookbehind affect performance helps you write efficient pattern matching in real projects and interviews.
"What if the lookbehind pattern was variable length instead of fixed length? How would the time complexity change?"