
Read replicas in HLD - Scalability & System Analysis

Scalability Analysis - Read replicas
Growth Table: Read Replicas Scaling
| Users | Traffic Characteristics | Database Load | Read Replica Usage | Notes |
|---|---|---|---|---|
| 100 users | Low read/write requests | Single primary DB handles all | Not needed | Simple setup, no replicas |
| 10,000 users | Moderate reads, writes increasing | Primary DB under moderate load | 1-2 read replicas for read scaling | Offload read queries to replicas |
| 1,000,000 users | High read volume, writes steady | Primary DB write bottleneck possible | Multiple read replicas distributed geographically | Use replicas for read-heavy traffic, reduce latency |
| 100,000,000 users | Very high read/write traffic | Primary DB write bottleneck, network limits | Sharded replicas, geo-distributed clusters | Advanced replication, caching, and sharding needed |
First Bottleneck

At small to medium scale, the primary database is the first bottleneck: it handles every write and, when no replicas exist, every read as well. As the user count grows, the primary server's CPU, disk I/O, and network bandwidth limit how quickly it can serve all requests.

Without read replicas, read-heavy workloads overwhelm the primary DB, causing slow responses and timeouts.

Scaling Solutions
  • Read Replicas: Create one or more read-only copies of the primary database to handle read queries, reducing load on the primary.
  • Load Balancing: Distribute read requests evenly across replicas to maximize throughput.
  • Geographical Distribution: Place replicas closer to users to reduce latency.
  • Write Optimization: Keep writes on the primary; optimize with batching or queueing.
  • Caching: Use in-memory caches (e.g., Redis) to reduce database reads further.
  • Sharding: For very large scale, split data across multiple primary-replica clusters by key ranges.
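
The first two items above (read replicas plus load balancing) can be sketched as a small router that sends writes to the primary and spreads reads across replicas in round-robin order. This is a minimal illustration, not a production design: the `ReplicaRouter` class, the node names, and the rule "a statement starting with SELECT is a read" are all assumptions made for the example.

```python
import itertools


class ReplicaRouter:
    """Route writes to the primary and spread reads across replicas (round-robin)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # cycle() yields replicas in order forever; None when there are no replicas.
        self._next_replica = itertools.cycle(self.replicas) if self.replicas else None

    def route(self, sql):
        """Return the name of the node that should run this statement."""
        # Simplifying assumption: only statements starting with SELECT are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._next_replica is not None:
            return next(self._next_replica)
        # Writes, and reads when no replica exists, go to the primary.
        return self.primary


# Hypothetical node names, used only for illustration.
router = ReplicaRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM users WHERE id = 1"),
    router.route("SELECT * FROM orders"),
    router.route("UPDATE users SET name = 'a' WHERE id = 1"),
    router.route("SELECT 1"),
]
print(targets)  # ['replica-1', 'replica-2', 'primary', 'replica-1']
```

With no replicas configured, the same router falls back to the primary for reads, which matches the 100-user row of the growth table.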
Back-of-Envelope Cost Analysis
  • Assume 10,000 users, each making 1 request per second: 10,000 QPS total.
  • A single primary DB can handle ~5,000 QPS, so it cannot serve this load alone; reads must be offloaded to replicas.
  • Each read replica can handle ~5,000 QPS; 2 replicas can serve up to 10,000 QPS of reads, leaving the primary free for writes.
  • Storage: each replica holds a full copy of the data, so it needs the same storage as the primary; plan for data growth.
  • Network bandwidth: 1 Gbps (~125 MB/s) can support ~10,000 QPS if requests average no more than ~12.5 KB each.
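
The arithmetic behind these estimates can be checked with a short script. The helper names `replicas_needed` and `max_bytes_per_request` are hypothetical, and the capacity figures are the rough assumptions stated in the list, not measurements.

```python
import math


def replicas_needed(read_qps, per_replica_qps):
    """Smallest number of replicas whose combined capacity covers read_qps."""
    return math.ceil(read_qps / per_replica_qps)


def max_bytes_per_request(link_bytes_per_sec, qps):
    """Average request size at which the link saturates for a given QPS."""
    return link_bytes_per_sec / qps


total_qps = 10_000        # 10,000 users x 1 request/second (from the text)
per_node_qps = 5_000      # assumed per-node capacity (from the text)
link_bytes = 125_000_000  # 1 Gbps is roughly 125 MB/s (from the text)

# Case 1: every request is a read and only replicas serve reads.
all_reads_on_replicas = replicas_needed(total_qps, per_node_qps)
# Case 2: the primary absorbs ~5,000 QPS itself; replicas take the rest.
remainder_on_replicas = replicas_needed(total_qps - per_node_qps, per_node_qps)
# Average request size the 1 Gbps link can sustain at 10,000 QPS.
bytes_budget = max_bytes_per_request(link_bytes, total_qps)

print(all_reads_on_replicas, remainder_on_replicas, bytes_budget)  # 2 1 12500.0
```

The two replica counts bracket the estimate: one replica suffices if the primary keeps serving its share of reads, while two are needed if all reads move off the primary.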
Interview Tip

When discussing read replicas in an interview, start by explaining how the primary database becomes a bottleneck under read-heavy workloads. Then describe how read replicas offload read queries to improve scalability and reduce latency. Mention trade-offs such as replication lag and eventual consistency. Finally, discuss how to scale replicas horizontally and combine them with caching and sharding for very large systems.
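
One common way to soften the replication-lag trade-off is to send a user's reads to the primary for a short window after that user writes, so they always see their own changes. The sketch below illustrates the idea only; the `LagAwareSession` class and the 1-second window are assumptions, and a real system would size the window from measured replication lag.

```python
import time


class LagAwareSession:
    """Send a user's reads to the primary for a short window after their own write."""

    def __init__(self, pin_seconds=1.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds  # assumed upper bound on replication lag
        self._clock = clock
        self._last_write = None

    def record_write(self):
        """Remember when this session last wrote to the primary."""
        self._last_write = self._clock()

    def read_target(self):
        """Return "primary" while a replica may still be stale, else "replica"."""
        if self._last_write is not None:
            if self._clock() - self._last_write < self.pin_seconds:
                return "primary"
        return "replica"


# A fake clock keeps the example deterministic.
now = [0.0]
session = LagAwareSession(pin_seconds=1.0, clock=lambda: now[0])

before_write = session.read_target()
session.record_write()
now[0] = 0.5
just_after_write = session.read_target()
now[0] = 2.0
after_window = session.read_target()
print(before_write, just_after_write, after_window)  # replica primary replica
```

The cost of this approach is that recently active writers add read load back onto the primary, so the window should be kept as short as the observed lag allows.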

Self Check

Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Assuming most of the new traffic is reads, add read replicas to offload read queries from the primary database. This reduces load on the primary and lets the system absorb the increased read traffic without immediate changes to the primary server.

Key Result
Read replicas help scale read-heavy workloads by offloading read queries from the primary database, improving throughput and reducing latency. The primary database remains the write bottleneck and requires further strategies, such as sharding, at very large scale.