| Users / Traffic | Round Robin | Least Connections |
|---|---|---|
| 100 users | Simple rotation; distributes requests evenly with almost no per-server state. | Tracks active connections; slightly more CPU, but balances well. |
| 10,000 users | Still works well, but may cause uneven load if requests vary in length. | Better at balancing uneven loads by checking active connections. |
| 1,000,000 users | Needs multiple load balancers; risk of uneven load if sessions are sticky. | More CPU overhead; requires efficient connection tracking and state sharing. |
| 100,000,000 users | Single load balancer insufficient; must use distributed load balancers or DNS-based balancing. | High complexity; needs distributed state or stateless design to scale. |
Load balancing algorithms (round robin, least connections) in HLD - Scalability & System Analysis
The load balancer itself becomes the first bottleneck as traffic grows.
Round Robin is simple but can cause uneven load if requests vary in duration.
Least Connections requires tracking active connections, increasing CPU and memory usage on the load balancer.
At high scale, the load balancer's CPU, memory, and network bandwidth are the first limits to break.
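The two selection policies above can be sketched minimally. This is an illustrative sketch, not a production implementation; the server names and the `release` hook are assumptions, and `release` shows the extra bookkeeping least connections needs when a connection closes:

```python
import itertools

class RoundRobin:
    """Cycle through servers in order, ignoring their current load."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Send each request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # per-server connection count

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # tracking cost round robin does not pay
        return server

    def release(self, server):
        self.active[server] -= 1  # must be called when a connection closes
```

Note the trade-off in code form: `RoundRobin.pick` is O(1) with no state beyond a cursor, while `LeastConnections` keeps a counter per server and must be told when connections end.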
- Horizontal scaling: Add more load balancer instances behind a DNS or anycast layer.
- Load balancer clustering: Share connection state so the least connections algorithm stays accurate across instances.
- Caching: Use session persistence or caching to cut the number of balancing decisions per user.
- Algorithm choice: Use round robin for simple, uniform traffic; least connections for uneven loads.
- Offload: Use hardware load balancers or cloud-managed services to handle high throughput.
- Stateless design: Design backend to be stateless to simplify load balancing.
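The clustering bullet above can be sketched as follows. This is a minimal sketch assuming an in-memory dict as a stand-in for a real shared store (e.g. Redis); the class and method names are illustrative, not a real API:

```python
class SharedCounters:
    """Stand-in for a shared store that every LB instance reads and updates."""
    def __init__(self, servers):
        self._counts = {s: 0 for s in servers}

    def incr(self, server):
        self._counts[server] += 1

    def decr(self, server):
        self._counts[server] -= 1

    def least_loaded(self):
        return min(self._counts, key=self._counts.get)

class LBInstance:
    """One of several load balancer instances; all consult the same counters,
    so least connections stays accurate no matter which instance picks."""
    def __init__(self, shared):
        self.shared = shared

    def pick(self):
        server = self.shared.least_loaded()
        self.shared.incr(server)
        return server
```

With a real networked store, every `pick` costs a round trip, which is exactly the extra CPU and latency overhead the notes above attribute to least connections at scale.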
Assuming one load balancer instance handles ~3,000 concurrent connections:
- At 10,000 users, ~4 instances (10,000 / 3,000 ≈ 3.3, rounded up) handle the traffic easily with round robin.
- At 1 million users, ~334 instances are needed (1,000,000 / 3,000), call it ~350 with headroom; least connections adds per-connection tracking overhead on each.
- Network bandwidth: 1 Gbps load balancer can handle ~125 MB/s; high traffic requires multiple load balancers or hardware acceleration.
- CPU usage on load balancer increases with least connections due to connection tracking.
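The capacity figures above as a quick back-of-the-envelope script. The ~3,000-connection and 1 Gbps figures are the assumptions stated in these notes, not measured limits:

```python
import math

CONNS_PER_LB = 3_000              # assumed concurrent connections per LB instance
LB_GBPS = 1                       # assumed NIC capacity of one load balancer
BYTES_PER_SEC = LB_GBPS * 1e9 / 8 # 1 Gbps = 125 MB/s

def balancers_needed(concurrent_users, conns_per_lb=CONNS_PER_LB):
    """Rough count of load balancer instances for a given concurrency."""
    return math.ceil(concurrent_users / conns_per_lb)

print(balancers_needed(10_000))      # -> 4
print(balancers_needed(1_000_000))   # -> 334 (~350 with headroom)
print(BYTES_PER_SEC / 1e6)           # -> 125.0 (MB/s per 1 Gbps balancer)
```

The same function makes the 100-million-user row concrete: no sane count of single-site instances covers it, which is why the table reaches for distributed or DNS-based balancing at that scale.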
Start by explaining the two algorithms simply: round robin cycles through servers evenly; least connections sends requests to the server with the fewest active connections.
Discuss pros and cons: round robin is simple but may cause uneven load; least connections balances better but uses more resources.
Identify bottlenecks: load balancer CPU, memory, and network.
Suggest scaling solutions: horizontal scaling, clustering, algorithm choice based on traffic pattern.
Use real numbers to show understanding of limits and trade-offs.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Don't scale the database first. In this design the load balancer in front of it is the first bottleneck as traffic grows, so add load balancer instances and scale horizontally to distribute the load; switch to least connections if the traffic is uneven. Only then revisit the database if it still saturates at 10,000 QPS.