
Why load balancers distribute traffic in HLD - Why It Works This Way

Overview - Why load balancers distribute traffic
What is it?
A load balancer is a system that shares incoming user requests across multiple servers. It helps make sure no single server gets overwhelmed by too many requests at once. This way, the system stays fast and reliable even when many people use it at the same time. Load balancers act like traffic managers for internet services.
Why it matters
Without load balancers, one server could get too busy and slow down or crash, causing delays or outages for users. This would make websites and apps unreliable and frustrating to use. Load balancers help keep services available and responsive, especially during busy times or sudden spikes in traffic. They make sure users get quick responses and the system stays healthy.
Where it fits
Before learning about load balancers, you should understand basic web servers and how clients send requests to them. After this, you can learn about scaling systems horizontally, fault tolerance, and advanced routing techniques. Load balancers are a key step in building systems that handle lots of users smoothly.
Mental Model
Core Idea
Load balancers evenly spread user requests across multiple servers to keep systems fast, reliable, and scalable.
Think of it like...
Imagine a busy restaurant with many customers arriving at once. Instead of all customers crowding one waiter, a host directs each new customer to the waiter with the fewest tables. This way, no waiter is overwhelmed, and everyone gets served quickly.
┌───────────────┐
│   Clients     │
└──────┬────────┘
       │ Requests
       ▼
┌───────────────┐
│ Load Balancer │
└──────┬────────┘
       │ Distributes
       ▼
┌──────┬───────┬──────┐
│Server│Server │Server│
│  1   │  2    │  3   │
└──────┴───────┴──────┘
Build-Up - 7 Steps
1
Foundation: What is a load balancer?
Concept: Introducing the basic role of a load balancer in a system.
A load balancer is a device or software that receives all incoming requests from users and sends each request to one of many servers. It acts as a single point that users connect to, hiding the complexity of multiple servers behind it.
Result
Users connect to one address, but their requests are spread across many servers.
Understanding the load balancer as the traffic controller helps grasp why it is essential for managing many users.
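The "single address, many servers" idea can be sketched in a few lines of Python. This is a toy model, not real proxying: the backend IPs are made up, and `handle_request` stands in for forwarding a network request.

```python
# Minimal sketch: clients see one entry point; the balancer hides the
# backend pool behind it. The IPs below are hypothetical examples.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def handle_request(request_id: int) -> str:
    # Pick a backend (here, trivially by request number) and "forward".
    backend = BACKENDS[request_id % len(BACKENDS)]
    return f"request {request_id} -> {backend}"

print(handle_request(0))  # request 0 -> 10.0.0.1
print(handle_request(4))  # request 4 -> 10.0.0.2
```

Clients never see the backend addresses; they only know the balancer's single entry point.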
2
Foundation: Why servers need help handling traffic
Concept: Explaining the problem of server overload and slow responses.
If all users connect to one server, it can get overwhelmed, causing slow responses or crashes. Servers have limits on how many requests they can handle at once. When too many requests come in, performance drops.
Result
Single servers become bottlenecks and reduce system reliability.
Knowing server limits clarifies why spreading requests is necessary for good user experience.
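The capacity limit above can be made concrete with back-of-the-envelope arithmetic. The numbers here are illustrative, not benchmarks.

```python
import math

# Illustrative numbers: one server handles 1,000 requests/second,
# but peak traffic is 3,500 requests/second.
server_capacity_rps = 1000
peak_traffic_rps = 3500

# A single server would be overloaded 3.5x at peak; spreading the load
# across ceil(3500 / 1000) = 4 servers keeps each under its limit.
servers_needed = math.ceil(peak_traffic_rps / server_capacity_rps)
print(servers_needed)  # 4
```

A load balancer is what makes those 4 servers look like one service to clients.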
3
Intermediate: How load balancers distribute requests
🤔 Before reading on: do you think load balancers send requests randomly or based on server status? Commit to your answer.
Concept: Introducing common methods load balancers use to decide where to send requests.
Load balancers use strategies like round-robin (sending requests in order), least connections (sending to the server with fewest active users), or health checks (only sending to servers that are working well). These methods help balance load fairly and avoid sending traffic to broken servers.
Result
Requests are distributed efficiently, improving speed and uptime.
Understanding distribution methods reveals how load balancers optimize resource use and avoid failures.
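The two strategies named above, round-robin and least connections, are simple enough to sketch directly. The server names and connection counts are made up for illustration.

```python
import itertools

servers = ["server-1", "server-2", "server-3"]

# Round-robin: cycle through the servers in a fixed order.
rr = itertools.cycle(servers)
rr_picks = [next(rr) for _ in range(4)]
print(rr_picks)  # ['server-1', 'server-2', 'server-3', 'server-1']

# Least connections: pick the server with the fewest active requests.
active_connections = {"server-1": 12, "server-2": 3, "server-3": 7}
least_loaded = min(active_connections, key=active_connections.get)
print(least_loaded)  # server-2
```

Round-robin ignores server state, which is fine when requests are uniform; least connections adapts when some requests take much longer than others.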
4
Intermediate: Types of load balancers
🤔 Before reading on: do you think load balancers only work at one network layer or multiple? Commit to your answer.
Concept: Explaining different layers where load balancers operate and their roles.
Load balancers can work at the network level (Layer 4) directing traffic based on IP and port, or at the application level (Layer 7) inspecting content like URLs or cookies to make smarter routing decisions. Application-level load balancers can do more complex tasks like routing users to specific servers based on their requests.
Result
Different load balancers fit different needs, from simple to complex routing.
Knowing the layers helps choose the right load balancer type for specific system requirements.
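The layer difference can be sketched as two routing functions. This is a simplification under assumed rules: the pool names and path prefixes are hypothetical, and real L4 balancers work on packets, not Python tuples.

```python
def route_l4(client_ip: str, port: int, servers: list) -> str:
    # Layer 4: only IP and port are visible, so hash them to pick a server.
    return servers[hash((client_ip, port)) % len(servers)]

def route_l7(path: str) -> str:
    # Layer 7: the request content (here, the URL path) is visible,
    # so traffic can be routed to specialised server pools.
    if path.startswith("/api/"):
        return "api-pool"
    if path.startswith("/static/"):
        return "static-pool"
    return "web-pool"

print(route_l7("/api/users"))      # api-pool
print(route_l7("/static/app.css")) # static-pool
```

An L4 balancer cannot make the `/api/` vs `/static/` distinction at all, because it never parses the request.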
5
Intermediate: Health checks and failover handling
Concept: How load balancers detect and avoid sending traffic to unhealthy servers.
Load balancers regularly check if servers respond correctly. If a server is down or slow, the load balancer stops sending requests to it until it recovers. This prevents users from experiencing errors or delays caused by broken servers.
Result
Systems stay reliable even when some servers fail.
Understanding health checks shows how load balancers improve fault tolerance and user experience.
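A minimal health-check policy might look like the sketch below. This is an assumed design, not any specific product's API: a server is removed from rotation after three failed probes in a row, and a successful probe resets its counter.

```python
FAIL_THRESHOLD = 3  # assumed policy: 3 consecutive failures = unhealthy

def update_health(failures: dict, server: str, probe_ok: bool) -> None:
    if probe_ok:
        failures[server] = 0  # a successful probe resets the counter
    else:
        failures[server] = failures.get(server, 0) + 1

def healthy_servers(failures: dict, servers: list) -> list:
    # Only servers below the failure threshold receive traffic.
    return [s for s in servers if failures.get(s, 0) < FAIL_THRESHOLD]

# Example: s2 fails three probes in a row and is taken out of rotation.
failures = {}
for _ in range(3):
    update_health(failures, "s2", probe_ok=False)
print(healthy_servers(failures, ["s1", "s2"]))  # ['s1']
```

Real systems tune the probe interval and threshold carefully, as the Expert Zone notes: too aggressive and healthy servers get evicted, too lenient and broken servers keep receiving traffic.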
6
Advanced: Scaling with load balancers in production
🤔 Before reading on: do you think one load balancer is enough for very large systems? Commit to your answer.
Concept: Exploring how load balancers themselves scale and avoid becoming bottlenecks.
In large systems, a single load balancer can become a bottleneck or point of failure. To avoid this, multiple load balancers are used in clusters or hierarchies. Techniques like DNS load balancing or anycast routing distribute traffic across load balancers themselves. This ensures the system scales horizontally and stays highly available.
Result
Load balancing scales beyond just servers to the load balancers themselves.
Knowing load balancer scaling prevents new bottlenecks and supports very large, reliable systems.
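DNS load balancing, mentioned above, can be sketched as a hostname resolving to several balancer addresses rather than one. The hostname and IPs below are hypothetical placeholders.

```python
import random

# Hypothetical DNS answer: one hostname resolves to several load
# balancer IPs, spreading traffic across the balancers themselves.
dns_records = {
    "www.example.com": ["203.0.113.10", "203.0.113.11", "203.0.113.12"],
}

def resolve(hostname: str) -> str:
    # Clients use one of the returned addresses; rotating or randomising
    # the choice distributes clients across the balancer fleet.
    return random.choice(dns_records[hostname])

print(resolve("www.example.com"))  # one of the three balancer IPs
```

As the text notes, DNS-level distribution reacts slowly to failures (cached records keep pointing at a dead balancer), which is why it is combined with clustering or anycast rather than used alone.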
7
Expert: Advanced routing and session persistence
🤔 Before reading on: do you think all user requests can go to any server, or do some need to stick to one? Commit to your answer.
Concept: How load balancers handle complex cases like user sessions that require consistent server connections.
Some applications need a user's requests to go to the same server to keep session data consistent. Load balancers use techniques like sticky sessions or session affinity to remember which server a user connected to and send future requests there. This adds complexity but is crucial for stateful applications.
Result
User sessions remain consistent without losing data or causing errors.
Understanding session persistence reveals the balance between load distribution and application requirements.
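One common way to implement session affinity without the balancer storing per-user state is to hash a session identifier. This is one possible technique, not the only one (cookie-based stickiness is another); the server names are illustrative.

```python
import hashlib

servers = ["server-1", "server-2", "server-3"]

def sticky_server(session_id: str) -> str:
    # Hash the session ID so the same user always maps to the same
    # server, with no per-user table on the balancer.
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The mapping is deterministic: repeated requests stick together.
assert sticky_server("user-42") == sticky_server("user-42")
```

The trade-off the text describes shows up here directly: if one session is very heavy, its server carries the full load, since the balancer can no longer spread that user's requests.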
Under the Hood
Load balancers intercept incoming network requests and use algorithms to select a backend server. They maintain state information like active connections and server health. At the network level, they modify packet headers to forward requests. At the application level, they parse request data to make routing decisions. Health checks run periodically to update server status. Load balancers can be hardware devices, software processes, or cloud services.
Why designed this way?
Load balancers were designed to solve the problem of limited capacity and reliability of single servers. Early systems used simple round-robin methods, but as applications grew complex, smarter algorithms and health checks were added. The design balances simplicity, performance, and fault tolerance. Alternatives like client-side load balancing exist but add complexity to clients. Centralized load balancers simplify management and improve control.
┌───────────────┐
│   Clients     │
└──────┬────────┘
       │ Requests
       ▼
┌───────────────┐
│ Load Balancer │
├───────────────┤
│ Health Checks │
│ Distribution  │
│ Algorithms    │
└──────┬────────┘
       │ Forwards
       ▼
┌──────┬───────┬──────┐
│Server│Server │Server│
│  1   │  2    │  3   │
└──────┴───────┴──────┘
Myth Busters - 4 Common Misconceptions
Quick: Do load balancers always send requests randomly? Commit to yes or no.
Common Belief: Load balancers just send requests randomly to servers.
Reality: Load balancers use specific algorithms like round-robin, least connections, or health-based routing to distribute traffic efficiently.
Why it matters: Random distribution can overload some servers and cause poor performance or downtime.
Quick: Can one load balancer handle unlimited traffic without issues? Commit to yes or no.
Common Belief: A single load balancer can handle any amount of traffic without becoming a bottleneck.
Reality: Load balancers themselves can become bottlenecks and need to be scaled or clustered for very large systems.
Why it matters: Ignoring load balancer scaling risks system outages or slowdowns under heavy load.
Quick: Do all user requests go to any server interchangeably? Commit to yes or no.
Common Belief: User requests can be sent to any server without issues.
Reality: Some applications require session persistence, so requests from the same user must go to the same server.
Why it matters: Failing to maintain session affinity can cause errors or lost data in stateful applications.
Quick: Are load balancers only useful for web servers? Commit to yes or no.
Common Belief: Load balancers are only needed for web servers.
Reality: Load balancers are used for many types of services, including databases, APIs, and messaging systems.
Why it matters: Limiting load balancers to web servers misses their broader role in system reliability and scalability.
Expert Zone
1
Load balancers can introduce latency; choosing the right algorithm balances speed and fairness.
2
Health checks must be carefully designed to avoid false positives that remove healthy servers or false negatives that keep unhealthy ones.
3
Session persistence can reduce load balancing effectiveness and must be used only when necessary.
When NOT to use
Load balancers are not ideal for very simple systems with low traffic where added complexity is unnecessary. Client-side load balancing or DNS-based distribution can be alternatives. Also, for some peer-to-peer or decentralized systems, load balancers are not applicable.
Production Patterns
In production, load balancers are often deployed in pairs or clusters for high availability. Cloud providers offer managed load balancers with auto-scaling and integrated health checks. Advanced setups use global load balancing to route users to the nearest data center. Sticky sessions are used selectively with caching or shared session stores to maintain performance.
Connections
DNS Load Balancing
Builds on
Understanding load balancers helps grasp how DNS can distribute traffic by resolving domain names to multiple IPs, but with less control and slower reaction to failures.
Operating System Process Scheduling
Similar pattern
Both load balancers and OS schedulers distribute work (requests or CPU time) across resources to optimize performance and avoid overload.
Traffic Management in Urban Planning
Analogous concept
Just like load balancers direct network traffic to avoid jams, urban planners design road systems and traffic lights to prevent congestion and keep cars moving smoothly.
Common Pitfalls
#1 Sending traffic to servers without health checks.
Wrong approach: Load balancer forwards requests to all servers regardless of their status.
Correct approach: Load balancer performs regular health checks and only sends requests to healthy servers.
Root cause: Assuming all servers are always available leads to sending requests to down servers, causing errors.
#2 Using sticky sessions unnecessarily for all applications.
Wrong approach: Load balancer forces all user requests to the same server even when not needed.
Correct approach: Use session persistence only when the application requires it; otherwise, distribute requests freely.
Root cause: Misunderstanding session needs reduces load balancing effectiveness and scalability.
#3 Relying on a single load balancer without redundancy.
Wrong approach: Deploying one load balancer without backup or clustering.
Correct approach: Use multiple load balancers in active-active or active-passive setups for high availability.
Root cause: Ignoring load balancer failure risks system downtime.
Key Takeaways
Load balancers distribute user requests across multiple servers to improve speed and reliability.
They use algorithms and health checks to send traffic efficiently and avoid broken servers.
Different types of load balancers operate at network or application layers for varying complexity.
Scaling load balancers themselves is crucial for very large systems to prevent new bottlenecks.
Session persistence balances load distribution with application needs for consistent user experience.