Cost-based optimization in DBMS Theory - Time & Space Complexity
Cost-based optimization helps a database decide the best way to run a query efficiently.
We want to know how the time to find the best plan grows as the query or data size grows.
Analyze the time complexity of this simplified cost-based optimizer process.
-- Pseudocode for cost-based optimization
FOR each possible query plan P
ESTIMATE cost of P
SELECT plan with lowest cost
This code tries all possible ways to run a query, estimates their costs, and picks the cheapest one.
Look for repeated steps that take most time.
- Primary operation: Checking each possible query plan.
- How many times: Number of plans grows with query complexity, often very fast.
As the query gets more parts (like joins), the number of plans to check grows quickly.
| Input Size (number of joins) | Approx. Plans Checked |
|---|---|
| 2 | 2 |
| 4 | ~20 |
| 6 | ~400 |
Pattern observation: The number of plans grows very fast, much faster than just doubling.
Time Complexity: O(n!)
This means the time to find the best plan grows extremely fast as the query size increases.
[X] Wrong: "The optimizer checks only a few plans, so it runs quickly for any query."
[OK] Correct: In reality, the number of plans can explode with query size, making optimization costly.
Understanding how optimization time grows helps you appreciate database design and query tuning skills.
What if the optimizer used heuristics to skip some plans? How would the time complexity change?