
Load balancing algorithms (round robin, least connections) in HLD - System Design Guide

Problem Statement
When a single server handles all incoming requests, it quickly becomes overwhelmed, causing slow responses and potential crashes. Without distributing traffic evenly, some servers stay idle while others get overloaded, leading to poor resource use and unstable service.
Solution
Load balancing algorithms distribute incoming requests across multiple servers to prevent overload. Round robin assigns requests in a fixed order, cycling through servers evenly. Least connections sends requests to the server with the fewest active connections, balancing load dynamically based on current usage.
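Both selection rules fit in a few lines. The sketch below is illustrative, not a production balancer: the server names are made up, and the connection counts would normally come from the load balancer's own bookkeeping.

```python
from itertools import count

servers = ["server-a", "server-b", "server-c"]

# Round robin: cycle through servers in a fixed order.
_rr = count()

def round_robin():
    return servers[next(_rr) % len(servers)]

# Least connections: track active connections per server and
# route each new request to the server with the fewest.
active = {s: 0 for s in servers}

def least_connections():
    return min(active, key=active.get)

def on_request_start(server):
    active[server] += 1

def on_request_end(server):
    active[server] -= 1
```

Note the difference in state: round robin needs only a counter, while least connections must increment and decrement a per-server count on every request start and finish, which is the tracking overhead mentioned under trade-offs below.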
Architecture
            ┌→ Server 1
Clients → Load Balancer ─┼→ Server 2
            └→ Server 3

This diagram shows a load balancer distributing incoming requests to multiple servers using an algorithm like round robin or least connections.

Trade-offs
✓ Pros
Round robin is simple to implement and ensures even distribution when servers have similar capacity.
Least connections adapts to server load dynamically, improving performance under uneven request durations.
Both algorithms improve system availability by preventing any single server from becoming a bottleneck.
✗ Cons
Round robin does not consider server load or capacity differences, which can cause uneven performance if servers vary.
Least connections requires tracking active connections, adding overhead and complexity to the load balancer.
Neither algorithm handles server failures inherently; additional health checks are needed.
Use round robin when servers have similar capacity and request processing times are uniform. Use least connections when request durations vary significantly or server capacities differ.
Avoid round robin when servers have different performance or request loads vary widely. Avoid least connections in very high throughput systems where tracking connections adds unacceptable overhead.
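As the cons above note, neither algorithm detects failures on its own; the usual fix is to layer a health check over the selection rule. A hypothetical sketch, where `is_healthy` and the connection counts stand in for a real health-check probe:

```python
# Assumed state: active connection counts and a set of servers
# currently failing their health check (both illustrative).
active = {"server-a": 2, "server-b": 0, "server-c": 1}
down = {"server-b"}

def is_healthy(server):
    # In practice this would reflect periodic probe results.
    return server not in down

def pick_server():
    # Filter to healthy servers first, then apply least connections.
    healthy = [s for s in active if is_healthy(s)]
    if not healthy:
        raise RuntimeError("no healthy servers available")
    return min(healthy, key=active.get)
```

Here server-b has the fewest connections but is excluded by the health check, so the request goes to the least-loaded healthy server instead.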
Real World Examples
Netflix
Netflix uses least connections load balancing to route streaming requests to edge servers with the fewest active streams, ensuring smooth playback.
Amazon
Amazon employs round robin load balancing for distributing API requests evenly across identical backend servers to maintain consistent response times.
Uber
Uber uses least connections to balance ride request traffic dynamically among servers, adapting to fluctuating demand and server load.
Alternatives
Consistent Hashing
Routes requests based on a hash of client identifiers to ensure the same client hits the same server, improving cache locality.
Use when: Session stickiness or cache affinity is critical, such as in user-specific data caching.
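A minimal hash-ring sketch, assuming MD5 for the hash and an arbitrary virtual-node count of 100 per server (both illustrative choices); the same client id always maps to the same server as long as the server set is unchanged:

```python
import hashlib
from bisect import bisect

def _hash(key):
    # Map any string to a large integer position on the ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each server gets `vnodes` points on the ring so load
        # spreads more evenly when servers are added or removed.
        self.ring = sorted(
            (_hash(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    def route(self, client_id):
        # Walk clockwise to the first virtual node at or after the hash.
        idx = bisect(self.keys, _hash(client_id)) % len(self.ring)
        return self.ring[idx][1]
```

Because routing depends only on the hash of the client id, removing one server remaps only the keys that landed on its virtual nodes, rather than reshuffling every client as a modulo-based scheme would.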
Weighted Round Robin
Extends round robin by assigning weights to servers based on capacity, sending more requests to stronger servers.
Use when: Servers have different capacities but request durations are similar.
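The simplest form expands each server into the rotation in proportion to its weight. The weights below are illustrative assumptions:

```python
from itertools import cycle

# server-a is assumed to have 3x the capacity of server-b.
weights = {"server-a": 3, "server-b": 1}

# Expand each server into the rotation once per unit of weight.
_schedule = cycle([s for s, w in weights.items() for _ in range(w)])

def weighted_round_robin():
    return next(_schedule)
```

This expanded-list form sends a server's requests in bursts (a, a, a, b, ...); smoother variants interleave picks so high-weight servers are not hit consecutively, at the cost of a slightly more involved scheduler.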
Summary
Load balancing algorithms distribute incoming requests to prevent server overload and improve availability.
Round robin cycles through servers evenly without considering load, suitable for uniform servers and requests.
Least connections routes to the server with the fewest active connections, adapting to dynamic load differences.