Conditional aggregation pattern in SQL - Time & Space Complexity
We want to understand how the time to run a conditional aggregation query changes as the data grows.
Specifically, how does checking conditions inside aggregation affect performance?
Analyze the time complexity of the following code snippet.
SELECT
department_id,
COUNT(CASE WHEN status = 'active' THEN 1 END) AS active_count,
COUNT(CASE WHEN status = 'inactive' THEN 1 END) AS inactive_count
FROM employees
GROUP BY department_id;
This query counts active and inactive employees per department using conditional aggregation.
- Primary operation: Scanning each row in the employees table once.
- How many times: Once per row, checking conditions and updating counts.
As the number of employees grows, the query checks each row once, applying conditions and counting.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks and counts |
| 100 | About 100 checks and counts |
| 1000 | About 1000 checks and counts |
Pattern observation: The work grows directly with the number of rows; doubling rows doubles work.
Time Complexity: O(n)
This means the time to run the query grows linearly with the number of rows in the table.
[X] Wrong: "Conditional aggregation makes the query run much slower because it checks multiple conditions per row repeatedly."
[OK] Correct: The query still scans each row only once, checking conditions in a single pass, so the time grows linearly, not exponentially.
Understanding how conditional aggregation scales helps you explain query performance clearly and shows you can reason about data processing efficiently.
"What if we added a nested subquery inside the CASE statements? How would the time complexity change?"