Layer 4 vs Layer 7 load balancing in HLD - Design Approaches Compared

Design: Load Balancing System

This design focuses on the load balancer components and how they differ at Layer 4 and Layer 7. Backend server implementations and client-side details are out of scope.

Functional Requirements

FR1: Distribute incoming client requests efficiently across multiple backend servers

FR2: Support both Layer 4 (Transport Layer) and Layer 7 (Application Layer) load balancing

FR3: Ensure high availability and fault tolerance

FR4: Provide low latency request routing

FR5: Support health checks for backend servers

FR6: Allow session persistence (sticky sessions) when needed

Non-Functional Requirements

NFR1: Handle up to 50,000 concurrent connections

NFR2: p99 API response latency under 100 ms

NFR3: Availability target of 99.9% uptime

NFR4: Support both TCP and HTTP/HTTPS protocols

NFR5: Scalable to add more backend servers without downtime
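
To make NFR1 and NFR3 concrete, here is a small back-of-the-envelope sketch. The helper names and the three-instance deployment are assumptions chosen for illustration; they are not part of the requirements.

```python
# Back-of-the-envelope numbers for the non-functional requirements.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes (leap years ignored)


def downtime_budget_minutes(availability: float) -> float:
    """Allowed downtime per year, in minutes, for an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def connections_per_instance(total: int, instances: int, failed: int = 0) -> float:
    """Connections each surviving load balancer holds if `failed` are down."""
    surviving = instances - failed
    if surviving <= 0:
        raise ValueError("no surviving load balancer instances")
    return total / surviving


if __name__ == "__main__":
    budget = downtime_budget_minutes(0.999)
    print(f"99.9% uptime allows about {budget / 60:.2f} hours of downtime per year")
    # Assumed deployment: 3 instances, sized so that 2 can carry the full load.
    print(connections_per_instance(50_000, instances=3, failed=1))
```

The second figure is the one to size each instance for: capacity planning should assume at least one load balancer is down.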

Key Components

Load balancer (Layer 4 and Layer 7)

Health check service

Backend server pool

Session persistence store

Monitoring and logging system

Design Patterns

Round-robin and least connections load balancing

Sticky sessions using cookies or IP hashing

SSL/TLS termination and passthrough

Content-based routing (URL, headers)

Failover and retry mechanisms
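
The first pattern in the list, round-robin versus least connections, can be sketched as follows. This is a minimal in-memory illustration; the class names and backend labels are hypothetical, and a real load balancer would also need locking and health awareness.

```python
import itertools


class RoundRobin:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Picks the backend with the fewest connections currently open."""

    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        # Ties go to the backend listed first.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when the connection to `backend` closes.
        self.active[backend] -= 1


if __name__ == "__main__":
    rr = RoundRobin(["app-1", "app-2", "app-3"])
    print([rr.pick() for _ in range(4)])

    lc = LeastConnections(["app-1", "app-2"])
    first, second = lc.acquire(), lc.acquire()
    lc.release(second)
    print(lc.acquire())  # goes to the backend that just freed a slot
```

Round-robin needs no per-connection state, while least connections adapts better when some requests are much longer-lived than others.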

Reference Architecture

Client
  |
  | TCP/HTTP Requests
  v
+-------------------+
|   Load Balancer   |
|  +-------------+  |
|  | Layer 4 LB  |  |---> Backend Servers (TCP level routing)
|  +-------------+  |
|                   |
|  +-------------+  |
|  | Layer 7 LB  |  |---> Backend Servers (HTTP routing, content-based)
|  +-------------+  |
+-------------------+
       |
       v
  Health Checks
       |
       v
Backend Server Pool

Components

Layer 4 Load Balancer

IP Hashing, TCP Proxy

Routes traffic based on IP address and port without inspecting application data, which keeps it fast and low-latency
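
As a rough sketch of what "based on IP and port" can mean in practice, the function below hashes the connection tuple to choose a backend. The function name, the choice of SHA-256, and the addresses are assumptions for illustration; production Layer 4 balancers do this in the kernel or in hardware.

```python
import hashlib


def pick_backend_l4(src_ip, src_port, dst_ip, dst_port, backends):
    """Choose a backend from the connection tuple alone.

    The payload is never read, so this works for any TCP protocol.
    """
    if not backends:
        raise ValueError("backend pool is empty")
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    pool = ["10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"]
    # Every packet of the same connection maps to the same backend.
    print(pick_backend_l4("203.0.113.7", 51544, "198.51.100.1", 443, pool))
```

Because the choice depends only on the tuple, no per-request parsing is needed, which is where the speed advantage comes from.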

Layer 7 Load Balancer

HTTP Proxy, Reverse Proxy (e.g., NGINX, Envoy)

Inspects HTTP headers and content to route requests based on URL, cookies, or headers
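
Content-based routing might look like the sketch below. The pool names, the path prefixes, and the `X-Canary` header are hypothetical examples, not configuration from NGINX or Envoy.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    path: str
    headers: dict = field(default_factory=dict)


# Hypothetical routing table: (path prefix, pool name), checked in order.
ROUTES = [("/api/", "api_pool"), ("/static/", "static_pool")]
DEFAULT_POOL = "web_pool"


def route_l7(request: Request) -> str:
    """Pick a backend pool from the HTTP headers and URL path."""
    # HTTP header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value for name, value in request.headers.items()}
    if headers.get("x-canary", "").lower() == "true":
        return "canary_pool"
    for prefix, pool in ROUTES:
        if request.path.startswith(prefix):
            return pool
    return DEFAULT_POOL


if __name__ == "__main__":
    print(route_l7(Request("/api/users")))
    print(route_l7(Request("/", {"X-Canary": "true"})))
    print(route_l7(Request("/index.html")))
```

None of these decisions are possible at Layer 4, because the path and headers are only visible after the HTTP request has been parsed.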

Health Check Service

Periodic TCP/HTTP probes

Monitors backend server health to avoid routing to unhealthy servers
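
A minimal sketch of a TCP probe with rise and fall thresholds follows. The threshold values and class name are assumptions; the idea is that a single failed probe should not immediately eject a server, and a single success should not immediately readmit it.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HealthState:
    """Flips state only after several consecutive contradicting probe results."""

    def __init__(self, rise: int = 2, fall: int = 3):
        self.rise = rise      # successes needed to mark an unhealthy server healthy
        self.fall = fall      # failures needed to mark a healthy server unhealthy
        self.healthy = True
        self._streak = 0      # consecutive results that contradict the current state

    def record(self, probe_ok: bool) -> bool:
        if probe_ok == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.rise if probe_ok else self.fall
            if self._streak >= threshold:
                self.healthy = probe_ok
                self._streak = 0
        return self.healthy


if __name__ == "__main__":
    state = HealthState()
    for result in [False, False, False, True, True]:
        print(result, "->", state.record(result))
```

In a running system, a background loop would call `tcp_probe` for each backend on a fixed interval and feed the result into that backend's `HealthState`.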

Backend Server Pool

Application servers

Handles actual client requests

Session Persistence Store

In-memory store (e.g., Redis) or sticky session mechanism

Maintains session affinity for clients requiring sticky sessions
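
The sketch below shows one way session affinity could work without an external store: a cookie pins the client at Layer 7, with IP hashing as the fallback (and the only option at Layer 4, where cookies are not visible). The cookie name `lb_backend` and the backend labels are hypothetical.

```python
import hashlib

BACKENDS = ["app-1", "app-2", "app-3"]
COOKIE_NAME = "lb_backend"  # hypothetical cookie name


def ip_hash(client_ip: str, backends):
    """Map a client IP to a backend deterministically."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def pick_sticky(client_ip, cookies, backends, healthy):
    """Return (backend, cookies_to_set) honoring an existing pin if possible."""
    pinned = cookies.get(COOKIE_NAME)
    if pinned in healthy:
        return pinned, {}
    # No pin, or the pinned server is unhealthy: choose again and re-pin.
    candidates = [b for b in backends if b in healthy]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    chosen = ip_hash(client_ip, candidates)
    return chosen, {COOKIE_NAME: chosen}


if __name__ == "__main__":
    backend, to_set = pick_sticky("203.0.113.7", {}, BACKENDS, set(BACKENDS))
    print(backend, to_set)
    # The follow-up request carries the cookie and lands on the same backend.
    print(pick_sticky("203.0.113.7", to_set, BACKENDS, set(BACKENDS)))
```

Pinning keeps a client on one server but does not protect session data if that server dies, which is the case a shared store such as Redis is meant to cover.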

Request Flow

1. Client sends request to load balancer IP and port

2. Layer 4 load balancer routes request based on IP and port without inspecting payload

3. If Layer 7 load balancing is enabled, request is forwarded to Layer 7 proxy

4. Layer 7 load balancer inspects HTTP headers and content to decide backend server

5. Load balancer confirms that the periodic health checks currently mark the chosen backend server as healthy, and picks another server if they do not

6. Request is forwarded to selected backend server

7. Backend server processes request and sends response back through load balancer

8. Load balancer forwards response to client

9. If session persistence is configured, later requests from the same client go to the same backend server, using cookies (Layer 7 only) or IP hashing (Layer 4 or Layer 7)
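
Steps 5 to 7 of the flow, combined with failover, can be sketched as below. The `send` callable stands in for the real network call and is an assumption, as is the cap of two attempts; retrying like this is only safe for requests that can be repeated without side effects.

```python
def forward_with_failover(request, backends, healthy, send, max_attempts=2):
    """Try healthy backends in order until one answers or attempts run out.

    `send(backend, request)` is a hypothetical callable that returns the
    response or raises ConnectionError.
    """
    attempts = 0
    last_error = None
    for backend in backends:
        if backend not in healthy:
            continue  # skip servers that health checks marked as down
        if attempts >= max_attempts:
            break
        attempts += 1
        try:
            return backend, send(backend, request)
        except ConnectionError as exc:
            last_error = exc  # fall through to the next healthy backend
    raise RuntimeError("no backend could serve the request") from last_error


if __name__ == "__main__":
    def fake_send(backend, request):
        # Simulated backend: app-1 refuses connections, the others answer.
        if backend == "app-1":
            raise ConnectionError("connection refused")
        return f"200 OK from {backend}"

    pool = ["app-1", "app-2", "app-3"]
    print(forward_with_failover("GET /", pool, set(pool), fake_send))
```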

Database Schema

Not applicable as this design focuses on load balancing components and routing logic rather than persistent data storage.

Scaling Discussion

Bottlenecks

Load balancer CPU and memory limits under high connection rates

Latency increase at Layer 7 from parsing HTTP requests (and terminating SSL/TLS, where enabled)

Single point of failure if load balancer is not redundant

Session persistence store becoming a bottleneck

Health check frequency causing overhead

Solutions

Use multiple load balancer instances with DNS or anycast for redundancy

Offload SSL/TLS termination to dedicated hardware or use optimized libraries

Implement horizontal scaling for Layer 7 proxies with consistent hashing

Use distributed in-memory stores for session persistence with replication

Tune health check intervals and use adaptive health checks
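
The consistent hashing mentioned above can be sketched with a hash ring and virtual nodes. The class name, the proxy labels, and the choice of 100 virtual nodes per proxy are illustrative assumptions.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring; each node is placed at `vnodes` points."""

    def __init__(self, nodes, vnodes: int = 100):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First ring point clockwise from the key's hash, wrapping at the end.
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]


if __name__ == "__main__":
    keys = [f"client-{i}" for i in range(1000)]
    before = HashRing(["proxy-1", "proxy-2", "proxy-3"])
    after = HashRing(["proxy-1", "proxy-2", "proxy-3", "proxy-4"])
    moved = sum(before.lookup(k) != after.lookup(k) for k in keys)
    print(f"{moved} of {len(keys)} keys moved after adding a proxy")
```

With plain modulo hashing, adding a fourth proxy would remap most clients. On the ring, only the keys that now belong to the new proxy move, so most sticky sessions survive a scale-out.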

Interview Tips

Time: In a 45-minute interview, spend 10 minutes clarifying requirements and constraints, 20 minutes designing the architecture and explaining the Layer 4 vs Layer 7 differences, 10 minutes discussing scaling and trade-offs, and 5 minutes on questions.

Explain the difference between Layer 4 and Layer 7 load balancing clearly

Discuss trade-offs: speed vs flexibility

Highlight how session persistence is handled differently

Mention health checks and fault tolerance

Discuss scaling strategies and avoiding single points of failure