| Users | Traffic Characteristics | Database Load | Read Replica Usage | Notes |
|---|---|---|---|---|
| 100 users | Low read/write requests | Single primary DB handles all | Not needed | Simple setup, no replicas |
| 10,000 users | Moderate reads, writes increasing | Primary DB under moderate load | 1-2 read replicas for read scaling | Offload read queries to replicas |
| 1,000,000 users | High read volume, writes steady | Primary DB write bottleneck possible | Multiple read replicas distributed geographically | Use replicas for read-heavy traffic, reduce latency |
| 100,000,000 users | Very high read/write traffic | Primary DB write bottleneck, network limits | Sharded replicas, geo-distributed clusters | Advanced replication, caching, and sharding needed |
Read replicas in HLD - Scalability & System Analysis
At small to medium scale, the primary database becomes the first bottleneck because, with no replicas, it serves every write and every read. As the user count grows, the primary server's CPU, disk I/O, and network bandwidth cap how many requests it can serve quickly.
Without read replicas, read-heavy workloads overwhelm the primary DB, causing slow responses and timeouts.
- Read Replicas: Create one or more read-only copies of the primary database to handle read queries, reducing load on the primary.
- Load Balancing: Distribute read requests evenly across replicas to maximize throughput.
- Geographical Distribution: Place replicas closer to users to reduce latency.
- Write Optimization: Keep writes on the primary; optimize with batching or queueing.
- Caching: Use in-memory caches (e.g., Redis) to reduce database reads further.
- Sharding: For very large scale, split data across multiple primary-replica clusters by key ranges.
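The first two items above (read replicas plus load balancing) can be sketched as a small router that sends writes to the primary and rotates reads across replicas. This is a minimal illustration, not a production proxy: the `ReplicaRouter` class, the node names, and the "starts with SELECT" read check are all assumptions made for the example.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ReplicaRouter:
    """Routes writes to the primary and spreads reads across replicas."""
    primary: str
    replicas: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Round-robin over replicas; fall back to the primary if none exist.
        targets = self.replicas or [self.primary]
        self._read_cycle = itertools.cycle(targets)

    def route(self, sql: str) -> str:
        """Return the name of the node that should run this statement."""
        # Assumption: a statement counts as a read only if it starts with SELECT.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


# Hypothetical node names, used only for illustration.
router = ReplicaRouter(primary="primary", replicas=["replica-1", "replica-2"])
print(router.route("SELECT * FROM users WHERE id = 1"))          # replica-1
print(router.route("SELECT * FROM orders"))                      # replica-2
print(router.route("UPDATE users SET name = 'a' WHERE id = 1"))  # primary
```

Round-robin is the simplest way to spread reads evenly; a real deployment might instead weight replicas by capacity or by proximity to the user.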
- Assuming 10,000 users, each making 1 request per second: 10,000 QPS total.
- Primary DB can handle ~5,000 QPS, so it cannot serve all 10,000 QPS alone; read replicas must absorb the reads.
- Each read replica can handle ~5,000 QPS; 2 replicas cover up to 10,000 QPS of reads, leaving the primary's capacity for writes.
- Storage: Replicas require same storage as primary; plan for data growth.
- Network bandwidth: 1 Gbps (~125 MB/s) can support ~10,000 QPS if each request and response averages under ~12.5 KB (125 MB/s / 10,000 QPS).
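The estimate above can be reproduced with a short calculation. The sketch below uses the capacity figures from the list; the function names and the `read_fraction` parameter (the share of traffic that is reads) are assumptions introduced for the example, since the list does not state a read/write split.

```python
import math


def replicas_needed(total_qps: float, read_fraction: float,
                    primary_qps: float, replica_qps: float) -> int:
    """Estimate the replica count when all reads go to replicas and writes stay on the primary."""
    write_qps = total_qps * (1 - read_fraction)
    if write_qps > primary_qps:
        # Replicas only scale reads; a write bottleneck needs sharding or batching.
        raise ValueError("writes alone exceed primary capacity")
    read_qps = total_qps * read_fraction
    return math.ceil(read_qps / replica_qps)


def max_bytes_per_query(link_gbps: float, qps: float) -> float:
    """Average bytes each query may use before the link saturates."""
    bytes_per_second = link_gbps * 1e9 / 8  # 1 Gbps is about 125 MB/s
    return bytes_per_second / qps


# Figures from the estimate: 10,000 QPS total, ~5,000 QPS per node.
print(replicas_needed(10_000, 1.0, 5_000, 5_000))  # 2 (worst case: all traffic is reads)
print(replicas_needed(10_000, 0.9, 5_000, 5_000))  # 2 (assumed 90% reads)
print(max_bytes_per_query(1, 10_000))              # 12500.0
```

In practice it is worth leaving headroom above these numbers, because a replica that fails or falls behind shifts its share of reads onto the remaining nodes.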
When discussing read replicas in an interview, start by explaining the primary database bottleneck under read-heavy workloads. Then describe how read replicas offload read queries to improve scalability and reduce latency. Mention trade-offs like replication lag and eventual consistency. Finally, discuss how to scale replicas horizontally and combine with caching and sharding for very large systems.
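To make the replication-lag trade-off concrete, one common mitigation is to send a user's reads to the primary for a short window after that user writes, so they always see their own changes. The sketch below is one hedged way to express that idea; the `LagAwareRouter` class, the `pin_seconds` window, and the node labels are assumptions for illustration, not part of any specific database's API.

```python
import time


class LagAwareRouter:
    """Sends a user's reads to the primary for a short window after that user writes."""

    def __init__(self, pin_seconds: float = 1.0, clock=time.monotonic):
        # Assumption: pin_seconds is a rough upper bound on replication lag.
        self.pin_seconds = pin_seconds
        self._clock = clock
        self._last_write: dict[str, float] = {}

    def record_write(self, user_id: str) -> str:
        """Remember when this user last wrote; writes always go to the primary."""
        self._last_write[user_id] = self._clock()
        return "primary"

    def route_read(self, user_id: str) -> str:
        """Pick the node for a read, avoiding stale results right after a write."""
        last = self._last_write.get(user_id)
        if last is not None and self._clock() - last < self.pin_seconds:
            return "primary"  # a replica may not have this user's write yet
        return "replica"


router = LagAwareRouter(pin_seconds=1.0)
router.record_write("alice")
print(router.route_read("alice"))  # primary (inside the pin window)
print(router.route_read("bob"))    # replica (no recent write)
```

The cost of this approach is that recent writers add read load back onto the primary, so the window should stay as short as the observed lag allows.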
Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Assuming most of the new traffic is reads, add read replicas to offload read queries from the primary database. This reduces load on the primary and lets the system absorb the increased read traffic without immediate changes to the primary server.