Subqueries in FROM (derived tables) in PostgreSQL - Time & Space Complexity
When a subquery appears inside the FROM clause (a derived table), the database evaluates that inner query to produce an intermediate result set, which the outer query then reads. We want to understand how the total work grows as the data size grows.
Analyze the time complexity of the following code snippet.
```sql
SELECT dt.customer_id, dt.total_orders
FROM (
    SELECT customer_id, COUNT(*) AS total_orders
    FROM orders
    GROUP BY customer_id
) AS dt
WHERE dt.total_orders > 5;
```
This query counts orders per customer in a subquery, then filters customers with more than 5 orders.
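To make the example concrete, here is the same query run end to end. This is a sketch using Python's built-in `sqlite3` with an in-memory database for portability (the derived-table and GROUP BY semantics match PostgreSQL for this query); the sample data is invented for illustration.

```python
import sqlite3

# In-memory database standing in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")

# Sample data: customer 1 places 7 orders, customer 2 places 3.
rows = [(None, 1)] * 7 + [(None, 2)] * 3
conn.executemany("INSERT INTO orders (order_id, customer_id) VALUES (?, ?)", rows)

result = conn.execute("""
    SELECT dt.customer_id, dt.total_orders
    FROM (
        SELECT customer_id, COUNT(*) AS total_orders
        FROM orders
        GROUP BY customer_id
    ) AS dt
    WHERE dt.total_orders > 5
""").fetchall()

print(result)  # only customer 1 clears the > 5 threshold: [(1, 7)]
```

Running it shows the filter applies to the grouped totals, not to individual order rows.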
- Primary operation: Scanning all rows in the orders table to count orders per customer.
- How many times: Once over all orders; then a quick filter over the grouped results.
The database reads every order once to count them by customer, so the work grows in step with the size of the orders table.
| Input Size (n orders) | Approx. Operations |
|---|---|
| 10 | About 10 reads and counts |
| 100 | About 100 reads and counts |
| 1000 | About 1000 reads and counts |
Pattern observation: The work grows roughly in direct proportion to the number of orders.
Time Complexity: O(n)
This means the time to run the query grows roughly in a straight line as the number of orders increases.
[X] Wrong: "The subquery runs multiple times, so the time grows much faster than the data size."
[OK] Correct: The subquery runs once to prepare the derived table, not once per outer row, so time grows linearly with the input size rather than quadratically.
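You can check the "runs once" claim against the planner itself. In PostgreSQL you would use EXPLAIN; the sketch below uses SQLite's EXPLAIN QUERY PLAN so it stays self-contained. The exact wording of plan steps varies by SQLite version, so we only look for how many steps scan the orders table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT dt.customer_id, dt.total_orders
    FROM (
        SELECT customer_id, COUNT(*) AS total_orders
        FROM orders
        GROUP BY customer_id
    ) AS dt
    WHERE dt.total_orders > 5
""").fetchall()

# The last column of each plan row is a human-readable step description.
steps = [row[-1] for row in plan]
print(steps)

# The orders table should appear in exactly one SCAN step: the subquery
# is evaluated once, not once per outer row.
scans = [s for s in steps if "SCAN" in s and "orders" in s]
```

If the subquery really re-ran per outer row, you would instead see the inner scan nested under a loop in the plan.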
Understanding how subqueries inside FROM work helps you explain query performance clearly and shows you can reason about how data size affects running time.
"What if the subquery included a join with another large table? How would the time complexity change?"