HLDsystem_design~25 mins

Database normalization vs denormalization in HLD - Design Approaches Compared

Choose your learning style9 modes available

Learn Why Deep Arch Practice Challenge Design Recall Scale

Design: Database Design: Normalization vs Denormalization

Focus on database schema design choices between normalization and denormalization. Exclude specific database engine tuning or hardware scaling.

Functional Requirements

FR1: Design a database schema that supports efficient data storage and retrieval

FR2: Ensure data consistency and minimize redundancy

FR3: Support fast read queries for reporting and analytics

FR4: Allow easy updates and inserts without data anomalies

Non-Functional Requirements

NFR1: Handle up to 1 million records

NFR2: Read latency under 200ms for common queries

NFR3: Availability of 99.9% uptime

NFR4: Support concurrent read and write operations

Think Before You Design

Questions to Ask

❓ Question 1

❓ Question 2

❓ Question 3

❓ Question 4

❓ Question 5

Key Components

Relational database management system (RDBMS)

Tables and relationships (foreign keys)

Indexes and materialized views

Caching layers for read optimization

Design Patterns

Normalization forms (1NF, 2NF, 3NF)

Denormalization techniques (pre-joined tables, redundant columns)

Use of indexes and query optimization

Eventual consistency models for denormalized data

Reference Architecture

  +-------------------+       +-------------------+
  |   Normalized DB   |       |  Denormalized DB  |
  |  (3NF schema)     |       |  (redundant data) |
  +---------+---------+       +---------+---------+
            |                           |
            |                           |
     +------+-------+           +-------+-------+
     | Application  |           | Application   |
     |  Server      |           |  Server       |
     +--------------+           +---------------+

Components

Normalized Database

Relational DB (e.g., PostgreSQL, MySQL)

Store data in multiple related tables to reduce redundancy and maintain consistency

Denormalized Database

Relational DB or NoSQL

Store data with some redundancy to optimize read performance and simplify queries

Application Server

Any backend framework

Handles business logic, queries database, and serves client requests

Request Flow

1. Client sends query to application server

2. Application server decides to query normalized or denormalized schema based on query type

3. For normalized DB: joins multiple tables to fetch consistent data

4. For denormalized DB: fetches pre-joined or redundant data for faster reads

5. Application server processes data and returns response to client

6. For writes, application updates normalized tables and propagates changes to denormalized tables if used

Database Schema

Entities: - User (UserID PK, Name, Email) - Order (OrderID PK, UserID FK, OrderDate) - Product (ProductID PK, Name, Price) - OrderItem (OrderItemID PK, OrderID FK, ProductID FK, Quantity) Normalized relationships: - User 1:N Order - Order 1:N OrderItem - Product 1:N OrderItem Denormalized example: - OrderSummary (OrderID PK, UserName, OrderDate, TotalAmount) with redundant UserName and pre-calculated TotalAmount

Scaling Discussion

Bottlenecks

Normalized DB can have slow read queries due to multiple joins

Denormalized DB can have data inconsistency if updates are not properly synchronized

Write operations can be slower in denormalized DB due to multiple updates

Large data volume can increase query latency and storage costs

Solutions

Use caching layers (e.g., Redis) to speed up frequent read queries

Implement background jobs to update denormalized tables asynchronously

Use database partitioning and indexing to improve query performance

Adopt eventual consistency where strict consistency is not critical

Monitor and optimize slow queries regularly

Interview Tips

Time: 10 minutes to clarify requirements and constraints, 20 minutes to design schema and explain trade-offs, 10 minutes to discuss scaling and optimizations, 5 minutes for questions

Explain difference between normalization and denormalization clearly

Discuss trade-offs: consistency vs performance

Show understanding of query patterns and how they influence design

Mention real-world examples where each approach fits best

Highlight scaling challenges and practical solutions