Finding gaps in sequences in SQL - Time & Space Complexity
When we look for missing numbers in a sequence stored in a database, we want to know how the time to find these gaps changes as the sequence grows.
We ask: How does the work increase when the list of numbers gets bigger?
Analyze the time complexity of the following code snippet.
WITH Numbered AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn
FROM sequence_table
)
SELECT id + 1 AS gap_start
FROM Numbered
WHERE id + 1 <> (SELECT id FROM Numbered WHERE rn = Numbered.rn + 1)
;
This code finds gaps in a sequence by comparing each number with the next expected number.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning each row in the sequence to compare with the next row.
- How many times: Once for each row in the sequence.
As the number of rows grows, the database checks each row once to find gaps.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 comparisons |
| 100 | About 100 comparisons |
| 1000 | About 1000 comparisons |
Pattern observation: The work grows roughly in direct proportion to the number of rows.
Time Complexity: O(n)
This means the time to find gaps grows linearly as the sequence gets longer.
[X] Wrong: "Finding gaps requires checking every possible number between the smallest and largest, so it must be much slower than just scanning the sequence."
[OK] Correct: The query only looks at existing rows once, comparing neighbors, so it does not check every missing number individually.
Understanding how to find gaps efficiently shows you can work with sequences and spot missing data quickly, a useful skill in many data tasks.
"What if we changed the query to find gaps in an unsorted sequence without using ROW_NUMBER()? How would the time complexity change?"