SIMILAR TO for regex-lite matching in PostgreSQL - Time & Space Complexity
We want to understand how the time it takes to check a pattern using SIMILAR TO grows as the data gets bigger.
How does the matching time change when the text or pattern size increases?
Analyze the time complexity of the following code snippet.
SELECT *
FROM users
WHERE username SIMILAR TO '%(abc|def)%';
This query finds all usernames containing either "abc" or "def" using SIMILAR TO pattern matching.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Checking each username against the SIMILAR TO pattern.
- How many times: Once per row in the users table.
As the number of usernames grows, the database checks more rows one by one.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 pattern checks |
| 100 | 100 pattern checks |
| 1000 | 1000 pattern checks |
Pattern observation: The work grows directly with the number of rows; doubling rows doubles checks.
Time Complexity: O(n)
This means the time to run the query grows in a straight line with the number of rows to check.
[X] Wrong: "SIMILAR TO uses indexes so it always runs fast no matter how many rows."
[OK] Correct: SIMILAR TO patterns usually require checking each row because indexes don't support this kind of pattern well, so time grows with rows.
Understanding how pattern matching scales helps you explain query performance and shows you know how databases handle text searches.
"What if we replaced SIMILAR TO with a simple equality check? How would the time complexity change?"