Why relational algebra is the theoretical foundation in DBMS Theory - Performance Analysis
Relational algebra forms the base for how databases process queries. Understanding its time complexity helps us see how query execution scales as data grows.
We want to know how the cost of running relational algebra operations changes with input size.
Analyze the time complexity of a simple relational algebra operation.
SELECT * FROM Employees WHERE Department = 'Sales';
-- This corresponds to the selection operation in relational algebra:
Ī_{Department='Sales'}(Employees)
This operation filters rows from the Employees table where the Department is 'Sales'.
Look at what repeats when this operation runs.
- Primary operation: Checking each row's Department value.
- How many times: Once for every row in the Employees table.
As the number of rows grows, the operation checks more rows one by one.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 checks |
| 100 | 100 checks |
| 1000 | 1000 checks |
Pattern observation: The number of checks grows directly with the number of rows.
Time Complexity: O(n)
This means the time to complete the selection grows in a straight line as the table gets bigger.
[X] Wrong: "Relational algebra operations always take the same time no matter the data size."
[OK] Correct: Each operation processes data row by row, so more data means more work and longer time.
Knowing how relational algebra operations scale helps you explain database query performance clearly and confidently.
What if we added an index on the Department column? How would the time complexity change?