Candidate key finding using closure in DBMS Theory - Time & Space Complexity
Finding candidate keys using closure involves checking attribute combinations to see if they determine all other attributes.
We want to understand how the time needed grows as the number of attributes increases.
Analyze the time complexity of the following pseudocode for candidate key finding using closure.
```
FOR each subset S of attributes in relation R
    Compute closure of S
    IF closure of S includes all attributes in R
        Mark S as a candidate key
    END IF
END FOR
```
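The loop above can be sketched in Python. This is a minimal illustration, not a production algorithm; `closure` and `find_superkeys` are illustrative names, and, matching the pseudocode, the search marks every covering subset (i.e., every superkey) without a minimality check.

```python
from itertools import combinations

def closure(attrs, fds):
    """Fixed-point closure of a set of attributes under FDs.

    fds is a list of (lhs, rhs) pairs, each side a frozenset of
    attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, absorb the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_superkeys(all_attrs, fds):
    """Try every non-empty subset, as in the pseudocode above."""
    attrs = sorted(all_attrs)
    found = []
    for r in range(1, len(attrs) + 1):         # subset sizes 1..n
        for subset in combinations(attrs, r):  # 2**n - 1 subsets in total
            if closure(subset, fds) == set(all_attrs):
                found.append(frozenset(subset))
    return found
```

For example, with FDs A → B and B → C over R(A, B, C), the search marks {A} (the candidate key) along with its supersets.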
This code tries every possible attribute subset, computes its closure, and checks whether it covers all attributes. (Strictly, a covering subset is a superkey; a candidate key must also be minimal, but adding the minimality check does not change the exponential enumeration cost.)
Look for the repeated steps that take the most time.
- Primary operation: Computing closure for each subset of attributes.
- How many times: Once per subset, and a relation with n attributes has 2^n subsets.
As the number of attributes (n) grows, the number of subsets grows very fast.
| Attributes (n) | Subsets to Check (2^n) |
|---|---|
| 3 | 8 |
| 5 | 32 |
| 10 | 1024 |
Pattern observation: The number of subsets doubles with each added attribute, causing exponential growth.
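The doubling pattern in the table is easy to verify directly (a quick sketch; `subsets_to_check` is just an illustrative name):

```python
def subsets_to_check(n):
    """Total attribute subsets for n attributes, empty set included."""
    return 2 ** n

counts = [subsets_to_check(n) for n in (3, 5, 10)]  # matches the table
# Each added attribute doubles the count:
doubles = all(subsets_to_check(n + 1) == 2 * subsets_to_check(n)
              for n in range(1, 20))
```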
Time Complexity: O(2^n * m)
Here n is the number of attributes and m is the number of functional dependencies: the 2^n factor counts the subsets enumerated, and each closure is costed at O(m) scans over the FDs. (A naive fixed-point closure may need up to n passes, giving O(2^n * n * m), but the exponential factor dominates either way.)
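Plugging numbers into the formula makes the growth concrete (a rough model that, as an assumption, costs each closure at m FD scans):

```python
def estimated_ops(n, m):
    """Rough operation count under the O(2**n * m) cost model."""
    return (2 ** n) * m

# With m = 5 functional dependencies:
small = estimated_ops(5, 5)    # 32 subsets  -> 160 operations
large = estimated_ops(20, 5)   # ~1M subsets -> 5,242,880 operations
```

Going from 5 to 20 attributes multiplies the work by over 32,000x, even though the schema only quadrupled in size.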
[X] Wrong: "Checking only a few subsets is enough to find all candidate keys quickly."
[OK] Correct: A candidate key can be any subset of attributes, so skipping subsets risks missing keys and produces incorrect results.
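A tiny concrete case shows why skipping subsets is unsafe. Suppose (hypothetically) a relation R(A, B, C) has the single FD AB → C; the closure check is inlined for brevity:

```python
def closure_under_ab_to_c(attrs):
    """Closure under the lone FD AB -> C, inlined for this example."""
    result = set(attrs)
    if {"A", "B"} <= result:
        result.add("C")
    return result

all_attrs = {"A", "B", "C"}
# No single attribute determines everything...
single_hits = [closure_under_ab_to_c({a}) == all_attrs for a in "ABC"]
# ...so a search limited to singletons would miss the real key, {A, B}.
pair_is_key = closure_under_ab_to_c({"A", "B"}) == all_attrs
```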
Understanding this complexity helps you explain why candidate key discovery can be slow and shows your grasp of database fundamentals.
What if we limited the subsets to only those with a fixed size? How would the time complexity change?
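One way to explore this is to restrict enumeration to size-k subsets with `itertools.combinations` (a sketch; it addresses only the enumeration cost, and a search capped at size k would miss any larger keys):

```python
from itertools import combinations
from math import comb

def fixed_size_subsets(attrs, k):
    """Enumerate only the size-k subsets: C(n, k) of them, not 2**n."""
    return list(combinations(sorted(attrs), k))

# For n = 10 and k = 2: C(10, 2) = 45 subsets versus 1024 for the full
# search. For a fixed k, C(n, k) is O(n**k) -- polynomial, not exponential.
pairs = fixed_size_subsets("ABCDEFGHIJ", 2)
```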