Read replicas in HLD - Scalability & System Analysis

Scalability Analysis - Read replicas

Growth Table: Read Replicas Scaling

Users	Traffic Characteristics	Database Load	Read Replica Usage	Notes
100 users	Low read/write requests	Single primary DB handles all	Not needed	Simple setup, no replicas
10,000 users	Moderate reads, writes increasing	Primary DB under moderate load	1-2 read replicas for read scaling	Offload read queries to replicas
1,000,000 users	High read volume, writes steady	Primary DB write bottleneck possible	Multiple read replicas distributed geographically	Use replicas for read-heavy traffic, reduce latency
100,000,000 users	Very high read/write traffic	Primary DB write throughput and network limits	Sharded primary-replica clusters, geo-distributed	Advanced replication, caching, and sharding needed

First Bottleneck

At small to medium scale, the primary database is the first bottleneck: with no replicas, it serves every read query in addition to all write operations. As the user count grows, the primary server's CPU, disk I/O, and network bandwidth limit how quickly it can serve requests.

Without read replicas, read-heavy workloads overwhelm the primary DB, causing slow responses and timeouts.

Scaling Solutions

Read Replicas: Create one or more read-only copies of the primary database to handle read queries, reducing load on the primary.
Load Balancing: Distribute read requests evenly across replicas to maximize throughput.
Geographical Distribution: Place replicas closer to users to reduce latency.
Write Optimization: Keep writes on the primary; optimize with batching or queueing.
Caching: Use in-memory caches (e.g., Redis) to reduce database reads further.
Sharding: For very large scale, split data across multiple primary-replica clusters by a shard key (for example, key ranges or a hash of the key).
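The first two items above (read replicas plus load balancing) can be sketched as a small router. This is a minimal illustration, not a production design: the class name `ReplicaRouter`, the node names, and the "starts with SELECT" rule for detecting reads are all assumptions made for the example.

```python
import itertools


class ReplicaRouter:
    """Sends writes to the primary and spreads reads across replicas (round-robin)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # With no replicas configured, reads fall back to the primary.
        self._cycle = itertools.cycle(self.replicas) if self.replicas else None

    def route(self, sql):
        """Return the name of the node that should run this statement."""
        # Simplifying assumption: any statement starting with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._cycle is not None:
            return next(self._cycle)
        return self.primary


# Hypothetical node names, used only for illustration.
router = ReplicaRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route(q)
    for q in (
        "SELECT * FROM users WHERE id = 1",
        "UPDATE users SET name = 'a' WHERE id = 1",
        "SELECT * FROM orders",
        "SELECT 1",
    )
]
print(targets)  # ['replica-1', 'primary', 'replica-2', 'replica-1']
```

A real router would also need to send locking reads (such as `SELECT ... FOR UPDATE`) and reads inside write transactions to the primary, and to skip unhealthy replicas.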

Back-of-Envelope Cost Analysis

Assuming 10,000 users, each making 1 request per second: 10,000 QPS total.
A single primary DB can handle ~5,000 QPS, so it cannot serve all 10,000 QPS alone; reads must move to replicas.
Each read replica can handle ~5,000 QPS; 2 replicas cover up to 10,000 QPS of reads, leaving the primary free for writes.
Storage: Each replica holds a full copy of the data, so it requires the same storage as the primary; plan for data growth.
Network bandwidth: 1 Gbps (~125 MB/s) can support ~10,000 QPS if each request transfers about 12.5 KB or less on average.
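The arithmetic above can be reproduced with a short calculation. This is a rough sketch under stated assumptions: the function names are hypothetical, the primary is assumed to serve only writes, and the 90% read share in the second call is an illustrative value that does not come from the figures above.

```python
import math


def replicas_needed(total_qps, read_fraction, primary_qps, replica_qps):
    """Estimate the replica count, assuming the primary serves only writes."""
    write_qps = total_qps * (1 - read_fraction)
    read_qps = total_qps * read_fraction
    if write_qps > primary_qps:
        # Replicas only absorb reads; they cannot relieve a write bottleneck.
        raise ValueError("writes alone exceed primary capacity")
    return math.ceil(read_qps / replica_qps)


def bytes_per_request(bandwidth_bytes_per_s, qps):
    """Average bytes each request may transfer before the link saturates."""
    return bandwidth_bytes_per_s / qps


# Worst case from the text: all 10,000 QPS are reads, 5,000 QPS per replica.
print(replicas_needed(10_000, 1.0, 5_000, 5_000))  # 2
# Assumed 90% read share (illustrative only).
print(replicas_needed(10_000, 0.9, 5_000, 5_000))  # 2
# 1 Gbps is about 125 MB/s; spread over 10,000 QPS.
print(bytes_per_request(125_000_000, 10_000))  # 12500.0
```

The estimate ignores headroom for failover and traffic spikes, so a real deployment would typically provision more than the computed minimum.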

Interview Tip

When discussing read replicas in an interview, start by explaining the primary database bottleneck under read-heavy workloads. Then describe how read replicas offload read queries to improve scalability and reduce latency. Mention trade-offs like replication lag and eventual consistency. Finally, discuss how to scale replicas horizontally and combine with caching and sharding for very large systems.
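The replication-lag trade-off mentioned above is often handled with a "read your own writes" rule: for a short window after a user writes, that user's reads go to the primary so they do not see stale data from a lagging replica. The sketch below is one possible approach, not a standard API; the class name, the 1-second window, and the injected clock are all assumptions for illustration.

```python
import time


class ReadYourWritesRouter:
    """Routes a user's reads to the primary shortly after that user writes."""

    def __init__(self, max_lag_seconds=1.0, clock=time.monotonic):
        # max_lag_seconds is an assumed upper bound on replication lag.
        self.max_lag_seconds = max_lag_seconds
        self.clock = clock
        self._last_write = {}  # user_id -> time of that user's last write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def target_for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.max_lag_seconds:
            return "primary"  # a replica may not have this user's write yet
        return "replica"


# Demo with a fake clock so the behaviour is deterministic.
now = [0.0]
rw_router = ReadYourWritesRouter(max_lag_seconds=1.0, clock=lambda: now[0])
rw_router.record_write("alice")
print(rw_router.target_for_read("alice"))  # primary
print(rw_router.target_for_read("bob"))    # replica
now[0] = 2.0  # two seconds later, replicas are assumed to have caught up
print(rw_router.target_for_read("alice"))  # replica
```

Other users may still briefly see stale data; this pattern only protects the writer's own view, which is the usual meaning of eventual consistency with read-your-writes.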

Self Check

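Your database currently handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: If the workload is read-heavy, add read replicas to offload read queries from the primary database. This reduces load on the primary and allows the system to handle increased read traffic without immediate changes to the primary server. If most of the growth is writes, replicas alone will not help, because every write still goes through the primary.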

Key Result

Read replicas help scale read-heavy workloads by offloading read queries from the primary database, improving throughput and reducing latency. The primary database remains the write bottleneck and requires further strategies at very large scale.