GROUP BY with multiple columns in MySQL - Time & Space Complexity
When we use GROUP BY with multiple columns in SQL, we want to see how the time to run the query changes as the data grows.
We ask: How does grouping by more columns affect the work the database does?
Analyze the time complexity of the following code snippet.
SELECT department, role, COUNT(*) AS count
FROM employees
GROUP BY department, role;
This query groups employees by their department and role, then counts how many are in each group.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning all rows in the employees table once.
- How many times: Once for each row in the table (n times).
As the number of employees grows, the database must look at each row once to group them.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 row checks and group updates |
| 100 | About 100 row checks and group updates |
| 1000 | About 1000 row checks and group updates |
Pattern observation: The work grows roughly in direct proportion to the number of rows.
Time Complexity: O(n)
This means the time to run the query grows linearly with the number of rows in the table.
[X] Wrong: "Adding more columns to GROUP BY makes the query take much longer than just scanning the rows."
[OK] Correct: The main cost is scanning all rows once; grouping by more columns changes how groups are formed but does not multiply the number of rows scanned.
Understanding how grouping works helps you explain query performance clearly and shows you can think about how data size affects work done.
What if we added an ORDER BY clause after GROUP BY? How would the time complexity change?