| Users | Requests per Second (RPS) | Load Balancer Role | Server Count |
|---|---|---|---|
| 100 | 50 | Single server, no load balancer needed | 1 |
| 10,000 | 5,000 | Single load balancer distributes traffic to few servers | 2-4 |
| 1,000,000 | 500,000 | Multiple load balancers with health checks and failover | 100+ |
| 100,000,000 | 50,000,000 | Global load balancers, geo-distribution, CDN integration | Thousands |
Why load balancers distribute traffic in HLD: scalability evidence
At low traffic, one server can handle all requests. As users grow, a single server eventually hits CPU, memory, and connection limits. Without a load balancer, traffic cannot be split across machines, causing slow responses or outright crashes.
Load balancers prevent this by spreading requests evenly, avoiding overload on any one server.
- Horizontal Scaling: Add more servers behind the load balancer to share traffic.
- Health Checks: Load balancers detect unhealthy servers and stop sending traffic to them.
- Session Persistence: Keep user sessions on the same server if needed.
- Global Load Balancing: Use multiple load balancers across regions for geo-distribution.
- Integration with CDN: Offload static content to CDN, reducing load on servers.
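The first two ideas above (horizontal scaling and health checks) can be sketched in a few lines. This is a minimal in-memory round-robin sketch, not a production load balancer; the server names and the `LoadBalancer` class are hypothetical illustrations.

```python
import itertools


class LoadBalancer:
    """Toy round-robin load balancer with manual health marking (illustrative only)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)          # start with all servers healthy
        self._cycle = itertools.cycle(self.servers)

    def mark_unhealthy(self, server):
        # A real health check would probe each server periodically (e.g. HTTP /health).
        self.healthy.discard(server)

    def mark_healthy(self, server):
        self.healthy.add(server)

    def route(self):
        # Walk the round-robin cycle, skipping servers marked unhealthy.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = LoadBalancer(["s1", "s2", "s3"])
lb.mark_unhealthy("s3")
print(lb.route(), lb.route(), lb.route())  # traffic now alternates between s1 and s2
```

Horizontal scaling in this model is just passing a longer server list to the constructor; the routing logic does not change.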
Assuming 1 server handles ~3000 RPS:
- At 10,000 RPS, need ~4 servers.
- At 500,000 RPS, need ~170 servers.
- Load balancer capacity: a single instance can handle roughly 10,000 concurrent connections.
- Network bandwidth: 1 Gbps = 125 MB/s, so servers and load balancers must have sufficient network capacity.
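The server-count and bandwidth estimates above can be checked with a quick back-of-the-envelope calculation. A sketch, assuming the ~3,000 RPS per-server figure from these notes and a hypothetical 10 KB average response size:

```python
import math

SERVER_CAPACITY_RPS = 3_000  # assumed per-server capacity (from the notes)


def servers_needed(total_rps, per_server_rps=SERVER_CAPACITY_RPS):
    # Round up: a fractional server still requires a whole machine.
    return math.ceil(total_rps / per_server_rps)


def egress_gbps(total_rps, avg_response_kb=10):
    # Rough egress bandwidth: responses dominate; 10 KB average is a hypothetical value.
    bits_per_second = total_rps * avg_response_kb * 1024 * 8
    return bits_per_second / 1e9


print(servers_needed(10_000))    # → 4
print(servers_needed(500_000))   # → 167
print(egress_gbps(500_000))      # ~41 Gbps, far beyond one 1 Gbps link
```

The last number shows why bandwidth, not just CPU, forces traffic to be spread across many servers and load-balancer instances at high scale.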
Suggested answer structure: start by explaining the problem of single-server limits, then introduce load balancers as the solution for distributing traffic, discuss how they improve availability and fault tolerance, and finally cover scaling strategies and monitoring.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Introduce a load balancer to distribute traffic across multiple application servers or database read replicas, so no single node is overloaded and performance is maintained.