Why Normalization Eliminates Data Anomalies in DBMS Theory: A Performance Analysis
We want to understand how normalization affects the work required to keep data consistent and error-free.
Specifically, how does normalization reduce repeated or conflicting writes?
To find out, we analyze the time complexity of updating data in a normalized versus an unnormalized table.
```sql
-- Unnormalized: the customer's address is copied onto every order row,
-- so this UPDATE must rewrite one row per order.
UPDATE Orders
SET CustomerAddress = 'New Address'
WHERE CustomerID = 123;

-- Normalized: the address is stored once in Customers,
-- so this UPDATE rewrites exactly one row.
UPDATE Customers
SET Address = 'New Address'
WHERE CustomerID = 123;
```
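The difference can be made concrete with a minimal sketch using Python's built-in `sqlite3` module. The table and column names follow the examples above; the ten-order dataset and the `OrderID` column are assumed purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized: the customer's address is repeated on every order row.
cur.execute("CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER, CustomerAddress TEXT)")
cur.executemany("INSERT INTO Orders VALUES (?, 123, 'Old Address')", [(i,) for i in range(10)])

# Normalized: the address lives once in Customers; Orders would only hold the key.
cur.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Address TEXT)")
cur.execute("INSERT INTO Customers VALUES (123, 'Old Address')")

# The unnormalized UPDATE touches one row per order.
cur.execute("UPDATE Orders SET CustomerAddress = 'New Address' WHERE CustomerID = 123")
print(cur.rowcount)  # 10 rows changed

# The normalized UPDATE touches exactly one row, regardless of how many orders exist.
cur.execute("UPDATE Customers SET Address = 'New Address' WHERE CustomerID = 123")
print(cur.rowcount)  # 1 row changed
```

If the unnormalized update misses even one of those ten rows, the table now stores two conflicting addresses for the same customer, which is precisely the update anomaly normalization prevents.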
This shows updating a customer's address in one large Orders table versus in a separate Customers table.
Count how many rows each UPDATE must change and note where the data repeats.
- Primary operation: updating a customer's address.
- How many times: in the unnormalized design, the update repeats for every order placed by that customer; in the normalized design, it happens exactly once.
As the number of orders grows, the unnormalized update must change more and more rows, while the normalized update still changes only one.
| Number of Orders (n) | Rows Updated (Unnormalized) | Rows Updated (Normalized) |
|---|---|---|
| 10 | 10 | 1 |
| 100 | 100 | 1 |
| 1000 | 1000 | 1 |
Pattern observation: Unnormalized updates grow linearly with orders; normalized updates stay constant.
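This growth pattern can be verified directly. The sketch below (again using Python's built-in `sqlite3`; the helper name `updates_required` and the single-customer dataset are assumptions for illustration) measures how many rows one address change touches as the order count grows.

```python
import sqlite3

def updates_required(n_orders: int) -> tuple:
    """Return (unnormalized, normalized) row counts touched by one address change."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER, CustomerAddress TEXT)")
    conn.executemany("INSERT INTO Orders VALUES (?, 123, 'Old')", [(i,) for i in range(n_orders)])
    conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Address TEXT)")
    conn.execute("INSERT INTO Customers VALUES (123, 'Old')")
    # rowcount reports how many rows each UPDATE actually modified.
    unnorm = conn.execute("UPDATE Orders SET CustomerAddress = 'New' WHERE CustomerID = 123").rowcount
    norm = conn.execute("UPDATE Customers SET Address = 'New' WHERE CustomerID = 123").rowcount
    return (unnorm, norm)

for n in (10, 100, 1000):
    print(n, updates_required(n))  # the unnormalized count grows with n; the normalized count stays 1
```

The measured counts reproduce the table above: the unnormalized cost is proportional to n, while the normalized cost is a constant.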
Time Complexity: O(n) for the unnormalized table and O(1) for the normalized table, where n is the number of orders placed by the customer.
This means normalization reduces repeated work by storing each fact once, so the cost of an update does not grow with the size of the data.
[X] Wrong: "Normalization makes updates slower because it splits data into many tables."
[OK] Correct: Normalization reduces repeated updates by storing each fact once, so fewer rows must change even though the number of tables increases.
Understanding how normalization affects update operations shows your grasp of efficient data management, a key skill in database design and real-world applications.
What if we denormalized some data back into one table? How would the time complexity of updates change?