
Database normalization vs denormalization in HLD - Architecture Trade-offs

Problem Statement
When data is stored without organization, it leads to redundancy and inconsistencies, causing errors during updates and inefficient storage. On the other hand, highly normalized databases can cause slow read performance due to many table joins, which hurts user experience in read-heavy systems.
Solution
Normalization organizes data into related tables to reduce duplication and maintain consistency by enforcing rules. Denormalization intentionally introduces some redundancy by combining tables or duplicating data to speed up read queries, trading off some consistency for performance.
Architecture
Normalized:

Customers                     Orders
───────────────               ───────────────
CustomerID (PK)               OrderID (PK)
...                           CustomerID (FK)
                              ...

Denormalized:

CustomerOrders
───────────────────────────────
OrderID | CustomerID | ... customer fields ... | ... order fields ...

The diagram shows normalized tables with separate Customers and Orders linked by CustomerID, and a denormalized single table combining customer and order data to reduce joins.
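The two designs can be sketched side by side with SQLite. This is a minimal illustration, not a production schema; the table and column names (customers, orders, customer_orders, amount) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized: customers and orders live in separate tables, linked by customer_id.
cur.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    amount REAL)""")
cur.execute("INSERT INTO customers VALUES (1, 'Ada')")
cur.execute("INSERT INTO orders VALUES (10, 1, 25.0)")

# Reading order plus customer data requires a join.
joined = cur.execute("""
    SELECT o.order_id, c.name, o.amount
    FROM orders o JOIN customers c ON o.customer_id = c.customer_id
""").fetchall()

# Denormalized: one table with customer data duplicated per order -- no join.
cur.execute("""CREATE TABLE customer_orders (
    order_id INTEGER PRIMARY KEY, customer_id INTEGER, name TEXT, amount REAL)""")
cur.execute("INSERT INTO customer_orders VALUES (10, 1, 'Ada', 25.0)")
flat = cur.execute("SELECT order_id, name, amount FROM customer_orders").fetchall()

print(joined == flat)  # True: same result, different read paths
```

Both queries return the same rows; the difference is that the normalized design pays the join cost on every read, while the denormalized one paid it once at write time.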

Trade-offs
✓ Pros
Normalization reduces data duplication, saving storage and preventing update anomalies.
Normalized design enforces data integrity through foreign keys and constraints.
Denormalization improves read performance by reducing the number of joins needed.
Denormalization simplifies query logic for read-heavy workloads.
✗ Cons
Normalization can cause slower read queries due to multiple joins.
Denormalization increases storage needs and risks data inconsistency.
Denormalized data requires extra effort to keep duplicated data synchronized.
Use normalization for transactional systems with frequent writes and updates where data integrity is critical. Use denormalization for read-heavy systems like reporting or analytics where query speed is more important than strict consistency.
Avoid denormalization when write operations dominate and data consistency is critical. Avoid heavy normalization in systems with strict read-latency requirements and relatively static data.
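The synchronization burden mentioned above can be shown concretely: once a customer's email is copied onto every order row, a change to it must touch every copy, or the table becomes inconsistent. A minimal sketch (the customer_orders table and its columns are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized table: the customer's email is duplicated on every order row.
cur.execute("""CREATE TABLE customer_orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    email TEXT,
    amount REAL)""")
cur.executemany(
    "INSERT INTO customer_orders VALUES (?, ?, ?, ?)",
    [(1, 42, "old@example.com", 9.99),
     (2, 42, "old@example.com", 19.99)])

# Changing the email means updating every duplicated row; an UPDATE that
# misses some rows would leave two "current" emails for the same customer.
cur.execute("UPDATE customer_orders SET email = ? WHERE customer_id = ?",
            ("new@example.com", 42))

emails = cur.execute(
    "SELECT DISTINCT email FROM customer_orders WHERE customer_id = 42"
).fetchall()
print(emails)  # a single distinct email only if every copy was updated
```

In a normalized design the email would live in one row of a customers table, so the update touches exactly one place.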
Real World Examples
Amazon
Uses normalization in its order processing system to ensure accurate and consistent transaction data.
Netflix
Uses denormalization in its recommendation and analytics databases to speed up complex read queries.
Airbnb
Balances normalization for transactional data and denormalization for fast search and listing retrieval.
Alternatives
Data Warehousing with ETL
Separates transactional normalized databases from denormalized analytical databases using extract-transform-load processes.
Use when: you need both strong transactional integrity and fast analytical queries without compromising either.
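A toy version of that ETL step, sketched with two in-memory SQLite databases standing in for the OLTP store and the warehouse (all table names are invented for the example): extract from the normalized tables, transform by joining once, and load the result into a denormalized fact table.

```python
import sqlite3

# "OLTP" side: normalized transactional tables.
oltp = sqlite3.connect(":memory:")
oltp.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.0);
""")

# "Warehouse" side: one denormalized table built for analytical reads.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("""CREATE TABLE fact_orders (
    order_id INTEGER, customer_name TEXT, amount REAL)""")

# Extract + transform: the join happens once, at load time,
# instead of on every analytical query.
rows = oltp.execute("""
    SELECT o.order_id, c.name, o.amount
    FROM orders o JOIN customers c ON o.customer_id = c.customer_id
""").fetchall()

# Load into the denormalized analytics table.
warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
total = warehouse.execute("SELECT SUM(amount) FROM fact_orders").fetchone()[0]
print(total)  # 30.0
```

Real pipelines add incremental loads, scheduling, and schema evolution, but the core trade is the same: writes stay normalized, reads get a pre-joined table.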
NoSQL Document Stores
Stores data in flexible, often denormalized document formats optimized for fast reads and horizontal scaling.
Use when: schema flexibility and high read throughput are more important than strict relational integrity.
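What such a denormalized document looks like can be sketched with a plain JSON-shaped dict; the field names here are illustrative, not from any particular document store.

```python
import json

# A denormalized order document: customer data is embedded directly,
# so a single read fetches everything the page needs.
order_doc = {
    "order_id": 10,
    "amount": 25.0,
    "customer": {"customer_id": 1, "name": "Ada", "email": "ada@example.com"},
    "items": [{"sku": "BOOK-1", "qty": 2}],
}

# Round-trip through JSON, as a document store would serialize it.
doc = json.loads(json.dumps(order_doc))
print(doc["customer"]["name"])  # Ada
```

One lookup returns the full document with no joins; the cost is that the customer's fields are duplicated across all of that customer's orders and must be kept in sync on updates.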
Summary
Normalization organizes data to reduce duplication and maintain consistency but can slow reads.
Denormalization adds redundancy to speed up read queries at the cost of extra storage and complexity.
Choosing between them depends on workload patterns and system priorities for consistency versus performance.