JOIN with aggregate functions in MySQL - Time & Space Complexity
When we use JOINs with aggregate functions in SQL, the database combines rows from two tables and then summarizes data.
We want to understand how the time to run this query grows as the tables get bigger.
Analyze the time complexity of the following code snippet.
SELECT c.customer_id, COUNT(o.order_id) AS total_orders
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id;
This query joins customers with their orders and counts how many orders each customer has.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: For each order, the database finds the matching customer and counts the order.
- How many times: This happens once per order.
As the number of customers and orders grows, the work to join and count grows too.
| Input Size (customers n, orders m) | Approx. Operations |
|---|---|
| 10 customers, 50 orders | About 50 matching checks and counts |
| 100 customers, 500 orders | About 500 matching checks and counts |
| 1000 customers, 5000 orders | About 5000 matching checks and counts |
Pattern observation: The total work grows roughly with the number of orders, since each order must be matched and counted.
Time Complexity: O(n + m)
This means the time grows roughly with the total number of customers plus orders, as the database processes both tables.
[X] Wrong: "The time depends only on the number of customers because we group by customer."
[OK] Correct: The database must check all orders to find matches, so orders affect the time too.
Understanding how JOINs with aggregates scale helps you explain query performance clearly and confidently in real situations.
"What if we added an index on the orders.customer_id column? How would the time complexity change?"