Why grouping is needed in SQL - Performance Analysis
When we use grouping in SQL, we want to organize data into sets that share something in common.
We ask: How does the work grow when we group more data?
Analyze the time complexity of the following SQL query using GROUP BY.
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
This query counts how many employees are in each department.
Look for repeated steps in the query.
- Primary operation: Scanning all employee rows and grouping them by department.
- How many times: Each employee row is checked once, then grouped.
As the number of employees grows, the query must look at each one to group them.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks and groupings |
| 100 | About 100 checks and groupings |
| 1000 | About 1000 checks and groupings |
Pattern observation: The work grows directly with the number of rows.
Time Complexity: O(n)
This means the time to group grows in a straight line with the number of rows.
[X] Wrong: "Grouping makes the query much slower than just reading data."
[OK] Correct: Grouping just looks at each row once, so it grows linearly, not much slower.
Understanding how grouping scales helps you explain how databases handle summaries efficiently.
"What if we added a WHERE filter before grouping? How would the time complexity change?"