| Users / Traffic | Round Robin | Least Connections |
|---|---|---|
| 100 users | Simple rotation; distributes requests evenly with almost no per-server state. | Tracks active connections; slightly more CPU, but balances well. |
| 10,000 users | Still works well, but may cause uneven load if requests vary in length. | Better at balancing uneven loads by checking active connections. |
| 1,000,000 users | Needs multiple load balancers; risk of uneven load if sessions are sticky. | More CPU overhead; requires efficient connection tracking and state sharing. |
| 100,000,000 users | Single load balancer insufficient; must use distributed load balancers or DNS-based balancing. | High complexity; needs distributed state or stateless design to scale. |
Load balancing algorithms (round robin, least connections) in HLD - Scalability & System Analysis
The load balancer itself becomes the first bottleneck as traffic grows.
Round Robin is simple but can cause uneven load if requests vary in duration.
Least Connections requires tracking active connections, increasing CPU and memory usage on the load balancer.
At high scale, the load balancer's CPU, memory, and network bandwidth are the first limits to break.
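The two selection policies above can be sketched minimally. This is an illustrative sketch, not a production implementation; the server names and the `release` hook are assumptions, and `release` shows the extra bookkeeping least connections needs when a connection closes:

```python
import itertools

class RoundRobin:
    """Cycle through servers in order, ignoring their current load."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Send each request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # per-server connection count

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # tracking cost round robin does not pay
        return server

    def release(self, server):
        self.active[server] -= 1  # must be called when a connection closes
```

Note the trade-off in code form: `RoundRobin.pick` is O(1) with no state beyond a cursor, while `LeastConnections` keeps a counter per server and must be told when connections end.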
- Horizontal scaling: Add more load balancer instances behind a DNS or anycast layer.
- Load balancer clustering: Share connection state so the least connections algorithm stays accurate across instances.
- Caching: Use session persistence or caching to cut the number of balancing decisions per user.
- Algorithm choice: Use round robin for simple, uniform traffic; least connections for uneven loads.
- Offload: Use hardware load balancers or cloud-managed services to handle high throughput.
- Stateless design: Design backend to be stateless to simplify load balancing.
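The clustering bullet above can be sketched as follows. This is a minimal sketch assuming an in-memory dict as a stand-in for a real shared store (e.g. Redis); the class and method names are illustrative, not a real API:

```python
class SharedCounters:
    """Stand-in for a shared store that every LB instance reads and updates."""
    def __init__(self, servers):
        self._counts = {s: 0 for s in servers}

    def incr(self, server):
        self._counts[server] += 1

    def decr(self, server):
        self._counts[server] -= 1

    def least_loaded(self):
        return min(self._counts, key=self._counts.get)

class LBInstance:
    """One of several load balancer instances; all consult the same counters,
    so least connections stays accurate no matter which instance picks."""
    def __init__(self, shared):
        self.shared = shared

    def pick(self):
        server = self.shared.least_loaded()
        self.shared.incr(server)
        return server
```

With a real networked store, every `pick` costs a round trip, which is exactly the extra CPU and latency overhead the notes above attribute to least connections at scale.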
Assuming one load balancer instance handles ~3,000 concurrent connections:
- At 10,000 users, ~4 instances (10,000 / 3,000 ≈ 3.3, rounded up) handle the traffic easily with round robin.
- At 1 million users, ~334 instances are needed (1,000,000 / 3,000), call it ~350 with headroom; least connections adds per-connection tracking overhead on each.
- Network bandwidth: 1 Gbps load balancer can handle ~125 MB/s; high traffic requires multiple load balancers or hardware acceleration.
- CPU usage on load balancer increases with least connections due to connection tracking.
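The capacity figures above as a quick back-of-the-envelope script. The ~3,000-connection and 1 Gbps figures are the assumptions stated in these notes, not measured limits:

```python
import math

CONNS_PER_LB = 3_000              # assumed concurrent connections per LB instance
LB_GBPS = 1                       # assumed NIC capacity of one load balancer
BYTES_PER_SEC = LB_GBPS * 1e9 / 8 # 1 Gbps = 125 MB/s

def balancers_needed(concurrent_users, conns_per_lb=CONNS_PER_LB):
    """Rough count of load balancer instances for a given concurrency."""
    return math.ceil(concurrent_users / conns_per_lb)

print(balancers_needed(10_000))      # -> 4
print(balancers_needed(1_000_000))   # -> 334 (~350 with headroom)
print(BYTES_PER_SEC / 1e6)           # -> 125.0 (MB/s per 1 Gbps balancer)
```

The same function makes the 100-million-user row concrete: no sane count of single-site instances covers it, which is why the table reaches for distributed or DNS-based balancing at that scale.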
Start by explaining the two algorithms simply: round robin cycles through servers evenly; least connections sends requests to the server with the fewest active connections.
Discuss pros and cons: round robin is simple but may cause uneven load; least connections balances better but uses more resources.
Identify bottlenecks: load balancer CPU, memory, and network.
Suggest scaling solutions: horizontal scaling, clustering, algorithm choice based on traffic pattern.
Use real numbers to show understanding of limits and trade-offs.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Don't scale the database first. In this design the load balancer in front of it is the first bottleneck as traffic grows, so add load balancer instances and scale horizontally to distribute the load; switch to least connections if the traffic is uneven. Only then revisit the database if it still saturates at 10,000 QPS.