Why CTEs matter in PostgreSQL - Performance Analysis
We want to understand how Common Table Expressions (CTEs) affect PostgreSQL query execution time.
Specifically: how does the amount of work grow as the data grows when a CTE is involved?
Analyze the time complexity of the following PostgreSQL query using a CTE.
```sql
WITH recent_orders AS (
    SELECT * FROM orders
    WHERE order_date > CURRENT_DATE - INTERVAL '30 days'
)
SELECT customer_id, COUNT(*)
FROM recent_orders
GROUP BY customer_id;
```
This query first selects the orders from the last 30 days, then counts how many orders each customer placed.
Look for repeated work inside the query.
- Primary operation: Scanning the orders table and filtering for orders from the last 30 days.
- How many times: Once. The CTE is evaluated a single time, and the outer query groups and counts its result.
As the number of orders grows, the work to find recent orders grows too.
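One way to confirm there is no repeated work is to ask PostgreSQL for its execution plan. This sketch uses the same query as above; the exact plan output depends on your data, indexes, and PostgreSQL version:

```sql
-- Show how PostgreSQL executes the CTE query.
-- The plan typically shows a single scan of orders (Seq Scan or Index Scan)
-- feeding a single aggregate node, i.e. the table is read only once.
EXPLAIN ANALYZE
WITH recent_orders AS (
    SELECT * FROM orders
    WHERE order_date > CURRENT_DATE - INTERVAL '30 days'
)
SELECT customer_id, COUNT(*)
FROM recent_orders
GROUP BY customer_id;
```

EXPLAIN ANALYZE actually runs the query and reports real row counts and timings, which is useful for the size comparison below.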
| Rows in orders (n) | Approx. Operations |
|---|---|
| 10 | Scan 10 rows, filter, then group and count |
| 100 | Scan 100 rows, filter, then group and count |
| 1000 | Scan 1000 rows, filter, then group and count |
Pattern observation: The scanning and grouping work grows roughly in direct proportion to the number of rows scanned.
Time Complexity: O(n)
This means the time to run the query grows roughly in a straight line as the number of rows increases. (Strictly, this assumes PostgreSQL's usual hash aggregate for the GROUP BY; a sort-based grouping plan would be O(n log n).)
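You can see this linear pattern yourself by building a throwaway orders table at different sizes and timing the query. The table definition below is an assumption (a minimal schema with just the two columns the query uses); adjust it to match your real schema:

```sql
-- Hypothetical setup: fill a temporary orders table with n synthetic rows.
CREATE TEMP TABLE orders AS
SELECT (i % 50) + 1            AS customer_id,   -- 50 fake customers
       CURRENT_DATE - (i % 60) AS order_date     -- dates spread over 60 days
FROM generate_series(1, 1000) AS s(i);           -- change 1000 to 10, 100, ...

-- Re-run at n = 10, 100, 1000 and compare the reported execution times;
-- they should grow roughly in proportion to n.
EXPLAIN ANALYZE
WITH recent_orders AS (
    SELECT * FROM orders
    WHERE order_date > CURRENT_DATE - INTERVAL '30 days'
)
SELECT customer_id, COUNT(*)
FROM recent_orders
GROUP BY customer_id;
```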
[X] Wrong: "CTEs always speed up queries because they break work into parts."
[OK] Correct: Before PostgreSQL 12, a CTE was always materialized (computed in full as a temporary result before the outer query ran), which acted as an optimization fence and could slow queries on large data. Since version 12, a CTE referenced only once is inlined into the outer query by default, unless it is marked AS MATERIALIZED.
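On PostgreSQL 12 and later you can control this behavior explicitly with the MATERIALIZED and NOT MATERIALIZED keywords, as documented for WITH queries. The queries below reuse the example from above:

```sql
-- Force materialization: compute the CTE fully before the outer query runs
-- (the pre-12 behavior).
WITH recent_orders AS MATERIALIZED (
    SELECT * FROM orders
    WHERE order_date > CURRENT_DATE - INTERVAL '30 days'
)
SELECT customer_id, COUNT(*) FROM recent_orders GROUP BY customer_id;

-- Allow inlining: the planner may fold the CTE into the outer query,
-- letting it use indexes on orders directly.
WITH recent_orders AS NOT MATERIALIZED (
    SELECT * FROM orders
    WHERE order_date > CURRENT_DATE - INTERVAL '30 days'
)
SELECT customer_id, COUNT(*) FROM recent_orders GROUP BY customer_id;
```

Comparing EXPLAIN output for the two variants is a quick way to see whether materialization changes the plan for your data.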
Understanding how CTEs affect query time helps you write better database queries and shows you know how databases handle work behind the scenes.
"What if we replaced the CTE with a subquery directly in the main query? How would the time complexity change?"
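As a starting point for that question, here is one way to rewrite the CTE as a derived-table subquery. The result is identical, and on PostgreSQL 12+ the planner often produces the same plan for both forms:

```sql
-- Same query, with the CTE replaced by a subquery in the FROM clause.
SELECT customer_id, COUNT(*)
FROM (
    SELECT * FROM orders
    WHERE order_date > CURRENT_DATE - INTERVAL '30 days'
) AS recent_orders
GROUP BY customer_id;
```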