HLDsystem_design~10 mins

Relational database strengths in HLD - Scalability & System Analysis

Choose your learning style9 modes available

Learn Why Deep Arch Practice Challenge Design Recall Scale

Scalability Analysis - Relational database strengths

Growth Table: Relational Database Scaling

Users / Traffic	Data Size	Query Load	System Changes
100 users	Small (MBs)	Low (few QPS)	Single DB instance, simple indexing
10,000 users	Medium (GBs)	Moderate (hundreds QPS)	Read replicas, connection pooling, optimized queries
1,000,000 users	Large (TBs)	High (thousands QPS)	Sharding, caching layers, partitioning, load balancing
100,000,000 users	Very Large (PBs)	Very High (tens of thousands QPS)	Multi-region clusters, advanced sharding, distributed transactions, strong caching/CDN

First Bottleneck

At small to medium scale, the database CPU and disk I/O become the first bottleneck because relational databases rely on complex joins and ACID transactions that consume CPU and disk resources.

As traffic grows, connection limits and query latency increase, causing slow responses.

Scaling Solutions

Vertical scaling: Upgrade server CPU, RAM, and faster disks to handle more queries.
Read replicas: Offload read queries to replicas to reduce load on primary DB.
Connection pooling: Reuse DB connections to reduce overhead.
Caching: Use in-memory caches (e.g., Redis) for frequent queries to reduce DB hits.
Sharding: Split data horizontally across multiple DB instances by key ranges or user IDs.
Partitioning: Divide large tables into smaller parts to improve query performance.
Load balancing: Distribute traffic evenly across app servers and DB replicas.
Multi-region deployment: Place DBs closer to users to reduce latency.

Back-of-Envelope Cost Analysis

Assuming 1 million users with 1 request per second each:

Requests per second: ~1,000,000 QPS (too high for single DB)
Single PostgreSQL instance handles ~5,000 QPS → need ~200 DB instances or heavy caching
Storage: 1 TB data growing at 10 GB/day requires scalable storage solutions
Network bandwidth: 1 Gbps = 125 MB/s, so multiple network interfaces or cloud bandwidth needed

Interview Tip

Start by explaining the strengths of relational databases: strong consistency, ACID transactions, and complex queries.

Then discuss how these strengths impact scaling: joins and transactions add CPU and I/O load.

Outline bottlenecks clearly and propose solutions step-by-step: vertical scaling, read replicas, caching, sharding.

Use real numbers to show understanding of limits and trade-offs.

Self Check

Question: Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first and why?

Answer: Add read replicas to offload read queries and implement connection pooling to reduce overhead. This addresses the immediate CPU and connection bottlenecks without major redesign.

Key Result

Relational databases excel at consistency and complex queries but face CPU and I/O bottlenecks as traffic grows; scaling requires read replicas, caching, and sharding to maintain performance.