| Users | Strong Consistency | Eventual Consistency |
|---|---|---|
| 100 users | Single primary DB node; synchronous replication; low latency acceptable | Simple replication; minor delays in data sync; easy to maintain |
| 10,000 users | Primary with multiple synchronous replicas; increased write latency; network delays visible | Asynchronous replication; replicas may lag; better write throughput |
| 1,000,000 users | High write latency; complex coordination; risk of bottlenecks on primary node | Multiple data centers; replicas updated asynchronously; stale reads possible but acceptable |
| 100,000,000 users | Very high coordination overhead; write throughput limited; complex consensus protocols needed | Highly distributed; eventual convergence guaranteed; system favors availability and partition tolerance |
## Consistency Models (Strong, Eventual) in HLD: Scalability & System Analysis
Under strong consistency, the first bottleneck is write latency and coordination overhead on the primary database node: as the user count grows, synchronous replication and consensus protocols slow every write.
Under eventual consistency, the bottleneck shifts to replication lag across replicas, which produces stale reads but never blocks writes.
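The difference between the two write paths can be seen in a toy sketch. Everything below (`Replica`, `StrongPrimary`, `EventualPrimary`) is hypothetical illustration code, not a real database API: the strongly consistent primary blocks until every replica acknowledges the write, while the eventually consistent primary acknowledges immediately and replicates in the background.

```python
class Replica:
    """Toy replica: just a key-value store that applies updates."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class StrongPrimary:
    """Synchronous replication: a write commits only after every replica acks.
    Coordination cost (and write latency) grows with the replica count."""
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        self.data[key] = value
        for r in self.replicas:      # blocks until each replica confirms
            r.apply(key, value)
        return "committed"


class EventualPrimary:
    """Asynchronous replication: a write is acknowledged immediately;
    replicas catch up later, so reads from them may be stale."""
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []            # replication backlog

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))   # queued, not yet replicated
        return "acknowledged"

    def replicate(self):             # in practice, a background process
        while self.pending:
            key, value = self.pending.pop(0)
            for r in self.replicas:
                r.apply(key, value)
```

After `EventualPrimary.write()` returns, a read against a replica can still miss the new value until `replicate()` runs; that window is exactly the stale-read risk described above.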
- Strong Consistency: Use distributed consensus algorithms (e.g., Paxos, Raft) optimized for latency; scale vertically with powerful nodes; limit write contention by sharding data.
- Eventual Consistency: Use asynchronous replication; deploy data centers closer to users; implement conflict resolution strategies; use caching to serve stale but fast data.
- Hybrid approaches: Use strong consistency for critical data and eventual consistency for less critical data to balance performance and correctness.
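One conflict-resolution strategy from the list above is last-write-wins: when replicas diverge, keep the version with the newest timestamp (using the replica id as a deterministic tie-breaker). This is a minimal sketch with made-up field names; production systems often prefer vector clocks or CRDTs, since wall-clock timestamps can silently drop concurrent updates.

```python
def last_write_wins(versions):
    """Resolve conflicting replica versions by picking the one with the
    highest (timestamp, replica_id) pair. Toy illustration only."""
    return max(versions, key=lambda v: (v["ts"], v["replica_id"]))

# Two replicas accepted different writes for the same key while partitioned:
conflicting = [
    {"replica_id": "a", "ts": 100, "value": "old"},
    {"replica_id": "b", "ts": 105, "value": "new"},
]
winner = last_write_wins(conflicting)
```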
Assuming 1 million users with 10 requests per second each:
- Total requests: 10 million QPS.
- Strong consistency DB can handle ~5,000 QPS per node; would need ~2,000 nodes or heavy sharding.
- Eventual consistency allows writes to be distributed; each node handles async replication; fewer nodes needed.
- Network bandwidth: 10 million QPS * 1 KB/request = ~10 GB/s; requires high bandwidth infrastructure.
- Storage: depends on data size and retention; replication increases storage needs by factor of replicas.
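The estimates above are simple arithmetic and worth checking explicitly. Under the same assumptions (1M users at 10 req/s each, ~5,000 QPS per strongly consistent node, ~1 KB per request):

```python
# Back-of-envelope capacity estimate, using the assumptions stated above.
users = 1_000_000
req_per_user_per_sec = 10
total_qps = users * req_per_user_per_sec            # 10,000,000 QPS

qps_per_node = 5_000                                # assumed per-node limit
nodes_needed = total_qps // qps_per_node            # 2,000 nodes

bytes_per_request = 1_000                           # ~1 KB
bandwidth_gb_per_sec = total_qps * bytes_per_request / 1e9   # ~10 GB/s

print(total_qps, nodes_needed, bandwidth_gb_per_sec)
```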
Start by defining the consistency models clearly. Then discuss trade-offs between latency, availability, and correctness. Use real-world examples like banking (strong consistency) vs social media feeds (eventual consistency). Finally, explain how scaling affects each model and what solutions you would apply.
Question: Your database handles 1,000 QPS with strong consistency. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Introduce read replicas to offload read traffic and implement sharding to distribute write load. Also, consider asynchronous replication if some delay is acceptable to improve write throughput.
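Both steps in the answer can be combined in a single routing layer. The `Router` below is a hypothetical sketch: it hashes each key to pick a shard (distributing the write load) and round-robins reads across that shard's replicas (offloading the primary). Node names and the shard layout are assumptions for illustration.

```python
import hashlib
import itertools

class Router:
    """Hypothetical request router: hash the key to choose a shard for
    writes, and round-robin reads across that shard's read replicas."""
    def __init__(self, shards):
        # shards: list of {"primary": node, "replicas": [node, ...]}
        self.shards = shards
        self._rr = [itertools.cycle(s["replicas"]) for s in shards]

    def _shard_index(self, key):
        # Stable hash so the same key always maps to the same shard.
        digest = hashlib.md5(key.encode()).hexdigest()
        return int(digest, 16) % len(self.shards)

    def route_write(self, key):
        return self.shards[self._shard_index(key)]["primary"]

    def route_read(self, key):
        return next(self._rr[self._shard_index(key)])
```

Note the trade-off this encodes: reads served from replicas may be slightly stale, which is the "consider asynchronous replication if some delay is acceptable" part of the answer.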