Self join patterns in PostgreSQL - Time & Space Complexity
When we use a self join, we combine a table with itself to compare rows. Understanding how long this takes helps us know if the query will run fast or slow as data grows.
We want to find out how the work needed changes when the table gets bigger.
Analyze the time complexity of the following code snippet.
SELECT a.id, b.id
FROM employees a
JOIN employees b ON a.manager_id = b.id;
This query finds pairs of employees and their managers by joining the employees table to itself.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Comparing each row in the first table to rows in the second table to find matches.
- How many times: For each of the n rows, the database looks for matching rows in the same table, potentially checking many rows.
As the number of employees grows, the number of comparisons grows quickly because each row is checked against others.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 100 comparisons |
| 100 | About 10,000 comparisons |
| 1000 | About 1,000,000 comparisons |
Pattern observation: The work grows much faster than the number of rows; doubling rows roughly quadruples the work.
Time Complexity: O(n²)
This means if the table size doubles, the work needed to complete the join roughly quadruples.
[X] Wrong: "A self join only takes as long as a normal join, so it's always fast."
[OK] Correct: Because the table joins with itself, the number of comparisons can grow much faster, making the query slower as data grows.
Understanding how self joins scale helps you explain query performance clearly and shows you can think about how data size affects work done.
"What if we add an index on the manager_id column? How would the time complexity change?"