0
0
HLDsystem_design~10 mins

Relational database strengths in HLD - Scalability & System Analysis

Choose your learning style9 modes available
Scalability Analysis - Relational database strengths
Growth Table: Relational Database Scaling
Users / TrafficData SizeQuery LoadSystem Changes
100 usersSmall (MBs)Low (few QPS)Single DB instance, simple indexing
10,000 usersMedium (GBs)Moderate (hundreds QPS)Read replicas, connection pooling, optimized queries
1,000,000 usersLarge (TBs)High (thousands QPS)Sharding, caching layers, partitioning, load balancing
100,000,000 usersVery Large (PBs)Very High (tens of thousands QPS)Multi-region clusters, advanced sharding, distributed transactions, strong caching/CDN
First Bottleneck

At small to medium scale, the database CPU and disk I/O become the first bottleneck because relational databases rely on complex joins and ACID transactions that consume CPU and disk resources.

As traffic grows, connection limits and query latency increase, causing slow responses.

Scaling Solutions
  • Vertical scaling: Upgrade server CPU, RAM, and faster disks to handle more queries.
  • Read replicas: Offload read queries to replicas to reduce load on primary DB.
  • Connection pooling: Reuse DB connections to reduce overhead.
  • Caching: Use in-memory caches (e.g., Redis) for frequent queries to reduce DB hits.
  • Sharding: Split data horizontally across multiple DB instances by key ranges or user IDs.
  • Partitioning: Divide large tables into smaller parts to improve query performance.
  • Load balancing: Distribute traffic evenly across app servers and DB replicas.
  • Multi-region deployment: Place DBs closer to users to reduce latency.
Back-of-Envelope Cost Analysis

Assuming 1 million users with 1 request per second each:

  • Requests per second: ~1,000,000 QPS (too high for single DB)
  • Single PostgreSQL instance handles ~5,000 QPS → need ~200 DB instances or heavy caching
  • Storage: 1 TB data growing at 10 GB/day requires scalable storage solutions
  • Network bandwidth: 1 Gbps = 125 MB/s, so multiple network interfaces or cloud bandwidth needed
Interview Tip

Start by explaining the strengths of relational databases: strong consistency, ACID transactions, and complex queries.

Then discuss how these strengths impact scaling: joins and transactions add CPU and I/O load.

Outline bottlenecks clearly and propose solutions step-by-step: vertical scaling, read replicas, caching, sharding.

Use real numbers to show understanding of limits and trade-offs.

Self Check

Question: Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first and why?

Answer: Add read replicas to offload read queries and implement connection pooling to reduce overhead. This addresses the immediate CPU and connection bottlenecks without major redesign.

Key Result
Relational databases excel at consistency and complex queries but face CPU and I/O bottlenecks as traffic grows; scaling requires read replicas, caching, and sharding to maintain performance.