Set operations with ORDER BY in SQL - Time & Space Complexity
We want to understand how the time needed to run a SQL query with set operations and ordering changes as the data grows.
Specifically, how does combining results and sorting them affect performance?
Analyze the time complexity of the following code snippet.
SELECT column1 FROM tableA
UNION
SELECT column1 FROM tableB
ORDER BY column1;
This query combines unique values from two tables and then sorts the final list by column1.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning both tables to collect rows, then sorting the combined results.
- How many times: Each row in both tables is processed once; sorting compares rows multiple times depending on total rows.
As the number of rows in both tables grows, the time to scan grows linearly, but sorting takes more time as the combined data grows.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 20 scans + sorting 20 rows |
| 100 | About 200 scans + sorting 200 rows |
| 1000 | About 2000 scans + sorting 2000 rows |
Pattern observation: Scanning grows straight with input size, sorting grows faster but not as fast as scanning squared.
Time Complexity: O(n log n)
This means the time grows a bit faster than the number of rows because sorting takes more steps as data grows.
[X] Wrong: "The query runs in straight linear time because it just reads rows once."
[OK] Correct: Sorting the combined results requires extra comparisons, so the time grows faster than just reading rows.
Understanding how set operations and sorting affect query time helps you explain performance in real projects and shows you can think about data size impact.
"What if we replaced UNION with UNION ALL (which does not remove duplicates)? How would the time complexity change?"