Cartesian product and joins in DBMS Theory - Time & Space Complexity
When working with database tables, it's important to know how the time to get results grows as tables get bigger.
We want to understand how the cost of combining tables changes with their size.
Analyze the time complexity of the following SQL query.
SELECT *
FROM TableA
CROSS JOIN TableB;
This query returns every possible pair of rows from TableA and TableB, called the Cartesian product.
Look at what repeats when the query runs.
- Primary operation: For each row in TableA, the database pairs it with every row in TableB.
- How many times: If TableA has n rows and TableB has m rows, this pairing happens n × m times.
As the number of rows grows, the total pairs grow much faster.
| Input Size (n rows in TableA, m rows in TableB) | Approx. Operations (pairs) |
|---|---|
| 10, 10 | 100 |
| 100, 100 | 10,000 |
| 1000, 1000 | 1,000,000 |
Pattern observation: Doubling the size of both tables causes the total pairs to grow by four times, showing a multiplication effect.
Time Complexity: O(n × m)
This means the time to get results grows proportionally to the product of the number of rows in both tables.
[X] Wrong: "The time grows only with the size of one table because we just scan one table at a time."
[OK] Correct: Because the query pairs every row from one table with every row from the other, both table sizes multiply the work, not just one.
Understanding how joins and Cartesian products scale helps you explain query performance clearly and shows you can think about database efficiency in real situations.
What if we changed the CROSS JOIN to an INNER JOIN with a condition that matches only some rows? How would the time complexity change?