GROUP BY with ORDER BY in SQL - Time & Space Complexity
When using GROUP BY with ORDER BY in SQL, it's important to understand how the query's work grows as the data grows.
We want to know how the time to group and sort data changes when the table gets bigger.
Analyze the time complexity of the following code snippet.
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
ORDER BY employee_count DESC;
This query groups employees by their department and then orders the groups by the number of employees in each department, from largest to smallest.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning all rows to group them by department.
- How many times: Once over all rows to group, then sorting the groups based on counts.
As the number of employees grows, the database must look at each row to group them.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 to group, then sort groups (few groups) |
| 100 | About 100 to group, then sort groups (more groups) |
| 1000 | About 1000 to group, then sort groups (even more groups) |
Pattern observation: The grouping work grows roughly in direct proportion to the number of rows, and sorting depends on the number of groups, which is usually smaller.
Time Complexity: O(n + g log g)
This means the time grows linearly with the number of rows, and sorting the groups adds a smaller cost depending on how many groups there are.
[X] Wrong: "The ORDER BY sorting takes as long as scanning all rows."
[OK] Correct: Sorting happens only on the grouped results, which are usually much fewer than the total rows.
Understanding how grouping and sorting scale helps you explain query performance clearly and shows you can think about data size impact in real projects.
"What if we added a WHERE clause to filter rows before grouping? How would the time complexity change?"