GROUP BY multiple columns in SQL - Time & Space Complexity
When we use GROUP BY with multiple columns, the database groups rows by combinations of those columns.
We want to know how the work grows as the table gets bigger.
Analyze the time complexity of the following code snippet.
SELECT department, role, COUNT(*)
FROM employees
GROUP BY department, role;
This query counts employees grouped by their department and role.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning all rows in the employees table once.
- How many times: Once for each row (n times).
As the number of rows grows, the database must look at each row to group it.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 row checks |
| 100 | About 100 row checks |
| 1000 | About 1000 row checks |
Pattern observation: The work grows directly with the number of rows.
Time Complexity: O(n)
This means the time to run the query grows in a straight line as the table gets bigger.
[X] Wrong: "Grouping by more columns makes the query take much longer than just one column."
[OK] Correct: The main work is still scanning each row once; grouping by more columns changes how groups form but does not multiply the scanning work.
Understanding how grouping scales helps you explain query performance clearly and shows you know how databases handle data as it grows.
"What if we added an ORDER BY after the GROUP BY? How would the time complexity change?"