HAVING for filtering groups in PostgreSQL - Time & Space Complexity
When we use HAVING in SQL, we filter groups after grouping data. Understanding how this affects time helps us know how query speed changes as data grows.
We want to see how the time to run a query with HAVING changes when the number of rows or groups increases.
Analyze the time complexity of the following code snippet.
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
This query groups employees by their department and then filters to keep only departments with more than 10 employees.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning all employee rows once to group them by department.
- How many times: Each employee row is processed once during grouping.
- Secondary operation: After grouping, filtering groups with HAVING condition is done once per group.
- How many times: Number of groups depends on distinct departments, usually much smaller than total rows.
As the number of employees grows, the time to scan and group them grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 row scans and grouping steps |
| 100 | About 100 row scans and grouping steps |
| 1000 | About 1000 row scans and grouping steps |
Filtering groups with HAVING happens after grouping, so its cost depends on the number of groups, which usually grows slower than total rows.
Overall, the main work grows roughly in direct proportion to the number of rows.
Time Complexity: O(n)
This means the time to run the query grows roughly in direct proportion to the number of rows in the table.
[X] Wrong: "HAVING filters rows one by one before grouping, so it reduces the work early."
[OK] Correct: HAVING filters after grouping, so all rows must be processed first to form groups before filtering happens.
Understanding how grouping and filtering with HAVING affects query time helps you explain database performance clearly and shows you know how data size impacts operations.
"What if we added an index on the department column? How would the time complexity change?"