0
0
HLDsystem_design~25 mins

Database normalization vs denormalization in HLD - Design Approaches Compared

Choose your learning style9 modes available
Design: Database Design: Normalization vs Denormalization
Focus on database schema design choices between normalization and denormalization. Exclude specific database engine tuning or hardware scaling.
Functional Requirements
FR1: Design a database schema that supports efficient data storage and retrieval
FR2: Ensure data consistency and minimize redundancy
FR3: Support fast read queries for reporting and analytics
FR4: Allow easy updates and inserts without data anomalies
Non-Functional Requirements
NFR1: Handle up to 1 million records
NFR2: Read latency under 200ms for common queries
NFR3: Availability of 99.9% uptime
NFR4: Support concurrent read and write operations
Think Before You Design
Questions to Ask
❓ Question 1
❓ Question 2
❓ Question 3
❓ Question 4
❓ Question 5
Key Components
Relational database management system (RDBMS)
Tables and relationships (foreign keys)
Indexes and materialized views
Caching layers for read optimization
Design Patterns
Normalization forms (1NF, 2NF, 3NF)
Denormalization techniques (pre-joined tables, redundant columns)
Use of indexes and query optimization
Eventual consistency models for denormalized data
Reference Architecture
  +-------------------+       +-------------------+
  |   Normalized DB   |       |  Denormalized DB  |
  |  (3NF schema)     |       |  (redundant data) |
  +---------+---------+       +---------+---------+
            |                           |
            |                           |
     +------+-------+           +-------+-------+
     | Application  |           | Application   |
     |  Server      |           |  Server       |
     +--------------+           +---------------+
Components
Normalized Database
Relational DB (e.g., PostgreSQL, MySQL)
Store data in multiple related tables to reduce redundancy and maintain consistency
Denormalized Database
Relational DB or NoSQL
Store data with some redundancy to optimize read performance and simplify queries
Application Server
Any backend framework
Handles business logic, queries database, and serves client requests
Request Flow
1. Client sends query to application server
2. Application server decides to query normalized or denormalized schema based on query type
3. For normalized DB: joins multiple tables to fetch consistent data
4. For denormalized DB: fetches pre-joined or redundant data for faster reads
5. Application server processes data and returns response to client
6. For writes, application updates normalized tables and propagates changes to denormalized tables if used
Database Schema
Entities: - User (UserID PK, Name, Email) - Order (OrderID PK, UserID FK, OrderDate) - Product (ProductID PK, Name, Price) - OrderItem (OrderItemID PK, OrderID FK, ProductID FK, Quantity) Normalized relationships: - User 1:N Order - Order 1:N OrderItem - Product 1:N OrderItem Denormalized example: - OrderSummary (OrderID PK, UserName, OrderDate, TotalAmount) with redundant UserName and pre-calculated TotalAmount
Scaling Discussion
Bottlenecks
Normalized DB can have slow read queries due to multiple joins
Denormalized DB can have data inconsistency if updates are not properly synchronized
Write operations can be slower in denormalized DB due to multiple updates
Large data volume can increase query latency and storage costs
Solutions
Use caching layers (e.g., Redis) to speed up frequent read queries
Implement background jobs to update denormalized tables asynchronously
Use database partitioning and indexing to improve query performance
Adopt eventual consistency where strict consistency is not critical
Monitor and optimize slow queries regularly
Interview Tips
Time: 10 minutes to clarify requirements and constraints, 20 minutes to design schema and explain trade-offs, 10 minutes to discuss scaling and optimizations, 5 minutes for questions
Explain difference between normalization and denormalization clearly
Discuss trade-offs: consistency vs performance
Show understanding of query patterns and how they influence design
Mention real-world examples where each approach fits best
Highlight scaling challenges and practical solutions