
Load balancing algorithms (round robin, least connections) in HLD - System Design Guide

Problem Statement
When a single server handles all incoming requests, it quickly becomes overwhelmed, causing slow responses and potential crashes. Without distributing traffic evenly, some servers stay idle while others get overloaded, leading to poor resource use and unstable service.
Solution
Load balancing algorithms distribute incoming requests across multiple servers to prevent overload. Round robin assigns requests in a fixed order, cycling through servers evenly. Least connections sends requests to the server with the fewest active connections, balancing load dynamically based on current usage.
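Both selection rules fit in a few lines. The sketch below is illustrative, not a production balancer: the server names are made up, and the connection counts would normally come from the load balancer's own bookkeeping.

```python
from itertools import count

servers = ["server-a", "server-b", "server-c"]

# Round robin: cycle through servers in a fixed order.
_rr = count()

def round_robin():
    return servers[next(_rr) % len(servers)]

# Least connections: track active connections per server and
# route each new request to the server with the fewest.
active = {s: 0 for s in servers}

def least_connections():
    return min(active, key=active.get)

def on_request_start(server):
    active[server] += 1

def on_request_end(server):
    active[server] -= 1
```

Note the difference in state: round robin needs only a counter, while least connections must increment and decrement a per-server count on every request start and finish, which is the tracking overhead mentioned under trade-offs below.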
Architecture
            ┌→ Server 1
Clients → Load Balancer ─┼→ Server 2
            └→ Server 3

This diagram shows a load balancer distributing incoming requests to multiple servers using an algorithm like round robin or least connections.

Trade-offs
✓ Pros
Round robin is simple to implement and ensures even distribution when servers have similar capacity.
Least connections adapts to server load dynamically, improving performance under uneven request durations.
Both algorithms improve system availability by preventing any single server from becoming a bottleneck.
✗ Cons
Round robin does not consider server load or capacity differences, which can cause uneven performance if servers vary.
Least connections requires tracking active connections, adding overhead and complexity to the load balancer.
Neither algorithm handles server failures inherently; additional health checks are needed.
Use round robin when servers have similar capacity and request processing times are uniform. Use least connections when request durations vary significantly or server capacities differ.
Avoid round robin when servers have different performance or request loads vary widely. Avoid least connections in very high throughput systems where tracking connections adds unacceptable overhead.
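As the cons above note, neither algorithm detects failures on its own; the usual fix is to layer a health check over the selection rule. A hypothetical sketch, where `is_healthy` and the connection counts stand in for a real health-check probe:

```python
# Assumed state: active connection counts and a set of servers
# currently failing their health check (both illustrative).
active = {"server-a": 2, "server-b": 0, "server-c": 1}
down = {"server-b"}

def is_healthy(server):
    # In practice this would reflect periodic probe results.
    return server not in down

def pick_server():
    # Filter to healthy servers first, then apply least connections.
    healthy = [s for s in active if is_healthy(s)]
    if not healthy:
        raise RuntimeError("no healthy servers available")
    return min(healthy, key=active.get)
```

Here server-b has the fewest connections but is excluded by the health check, so the request goes to the least-loaded healthy server instead.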
Real World Examples
Netflix
Netflix uses least connections load balancing to route streaming requests to edge servers with the fewest active streams, ensuring smooth playback.
Amazon
Amazon employs round robin load balancing for distributing API requests evenly across identical backend servers to maintain consistent response times.
Uber
Uber uses least connections to balance ride request traffic dynamically among servers, adapting to fluctuating demand and server load.
Alternatives
Consistent Hashing
Routes requests based on a hash of client identifiers to ensure the same client hits the same server, improving cache locality.
Use when: Session stickiness or cache affinity is critical, such as in user-specific data caching.
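A minimal hash-ring sketch, assuming MD5 for the hash and an arbitrary virtual-node count of 100 per server (both illustrative choices); the same client id always maps to the same server as long as the server set is unchanged:

```python
import hashlib
from bisect import bisect

def _hash(key):
    # Map any string to a large integer position on the ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each server gets `vnodes` points on the ring so load
        # spreads more evenly when servers are added or removed.
        self.ring = sorted(
            (_hash(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    def route(self, client_id):
        # Walk clockwise to the first virtual node at or after the hash.
        idx = bisect(self.keys, _hash(client_id)) % len(self.ring)
        return self.ring[idx][1]
```

Because routing depends only on the hash of the client id, removing one server remaps only the keys that landed on its virtual nodes, rather than reshuffling every client as a modulo-based scheme would.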
Weighted Round Robin
Extends round robin by assigning weights to servers based on capacity, sending more requests to stronger servers.
Use when: Servers have different capacities but request durations are similar.
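The simplest form expands each server into the rotation in proportion to its weight. The weights below are illustrative assumptions:

```python
from itertools import cycle

# server-a is assumed to have 3x the capacity of server-b.
weights = {"server-a": 3, "server-b": 1}

# Expand each server into the rotation once per unit of weight.
_schedule = cycle([s for s, w in weights.items() for _ in range(w)])

def weighted_round_robin():
    return next(_schedule)
```

This expanded-list form sends a server's requests in bursts (a, a, a, b, ...); smoother variants interleave picks so high-weight servers are not hit consecutively, at the cost of a slightly more involved scheduler.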
Summary
Load balancing algorithms distribute incoming requests to prevent server overload and improve availability.
Round robin cycles through servers evenly without considering load, suitable for uniform servers and requests.
Least connections routes to the server with the fewest active connections, adapting to dynamic load differences.