Global server load balancing (GSLB) in HLD - System Design Guide

Problem Statement

When a single data center or server cluster handles all user requests globally, it becomes a bottleneck causing slow response times and outages during traffic spikes or failures. Users far from the server experience high latency, and if the data center goes down, the entire service becomes unavailable worldwide.

Solution

Global server load balancing distributes user requests across multiple geographically dispersed data centers or server clusters. It directs traffic based on factors like proximity, server health, and load, ensuring users connect to the nearest or best-performing server. This reduces latency, improves availability, and provides failover if one location fails.
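The selection logic described above can be sketched in a few lines. This is a minimal illustration, not a production policy: the `DataCenter` fields, the `max_load` threshold of 0.8, and the use of great-circle distance as a stand-in for network proximity are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    lat: float
    lon: float
    healthy: bool
    load: float  # fraction of capacity in use, 0.0 to 1.0 (assumed metric)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two coordinates."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def pick_data_center(user_lat, user_lon, centers, max_load=0.8):
    """Return the nearest healthy site, preferring sites below max_load."""
    healthy = [c for c in centers if c.healthy]
    if not healthy:
        return None  # nothing left to fail over to
    # If every healthy site is above the threshold, fall back to all of them
    # rather than refusing to answer.
    candidates = [c for c in healthy if c.load < max_load] or healthy
    return min(candidates, key=lambda c: haversine_km(user_lat, user_lon, c.lat, c.lon))


centers = [
    DataCenter("us-east", 39.0, -77.5, True, 0.40),
    DataCenter("eu-west", 53.3, -6.3, True, 0.50),
    DataCenter("ap-south", 19.1, 72.9, True, 0.30),
]
# A user near London is sent to eu-west while it is healthy and not overloaded.
print(pick_data_center(51.5, -0.1, centers).name)
```

Real deployments usually estimate proximity from the resolver's or client's IP address and measured latency rather than raw geography, so the distance function here is only a placeholder for that signal.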

Architecture

DNS Resolver → GSLB Controller

↓

Data Center 1 / Data Center 2 / Data Center 3

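This diagram shows the DNS resolver querying the GSLB controller, which answers with the address of the nearest or healthiest data center among multiple global locations. The user then connects to that data center directly; the controller itself does not sit in the request path.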
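The resolution flow can be sketched as a controller that answers with an IP address and a TTL, plus a resolver that caches the answer. The class names, the per-region preference lists, the 30-second TTL, and the IP addresses (taken from the reserved documentation ranges) are hypothetical, and the cache is keyed by region only to keep the example small; a real resolver caches per hostname.

```python
import time


class GslbController:
    """Hypothetical authoritative name server for a single hostname."""

    def __init__(self, site_ips, preference, ttl_seconds=30):
        self.site_ips = site_ips        # site name -> IP address
        self.preference = preference    # client region -> sites, best first
        self.ttl_seconds = ttl_seconds
        self.healthy = set(site_ips)    # all sites start healthy

    def answer(self, region):
        """Return (ip, ttl) for the best healthy site for this region."""
        for site in self.preference[region]:
            if site in self.healthy:
                return self.site_ips[site], self.ttl_seconds
        raise LookupError("no healthy data center")


class CachingResolver:
    """Simplified recursive resolver that honours the TTL it was given."""

    def __init__(self, controller, clock=time.monotonic):
        self.controller = controller
        self.clock = clock
        self.cache = {}  # region -> (ip, expires_at)

    def resolve(self, region):
        now = self.clock()
        cached = self.cache.get(region)
        if cached and cached[1] > now:
            return cached[0]  # served from cache, controller is not asked
        ip, ttl = self.controller.answer(region)
        self.cache[region] = (ip, now + ttl)
        return ip


controller = GslbController(
    site_ips={"dc1": "192.0.2.10", "dc2": "198.51.100.10", "dc3": "203.0.113.10"},
    preference={"eu": ["dc2", "dc1", "dc3"]},
)
resolver = CachingResolver(controller)
print(resolver.resolve("eu"))  # the client connects to this address directly
```

Because the resolver keeps the answer until the TTL expires, a change in the controller's view of health only reaches clients on their next lookup.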

Trade-offs

✓ Pros

→ Reduces latency by directing users to the closest data center.

→ Improves availability with automatic failover across regions.

→ Balances load globally to prevent any single data center from overload.

→ Enhances disaster recovery by distributing traffic geographically.
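Automatic failover depends on the controller knowing which sites are up. One common approach, sketched here under assumed thresholds, is to change a site's state only after several consecutive probe results disagree with it, so a single dropped probe does not trigger failover. `HealthTracker` and its parameters are illustrative names, not part of any specific product.

```python
class HealthTracker:
    """Tracks site health from probe results using consecutive-result thresholds."""

    def __init__(self, sites, fail_threshold=3, recover_threshold=2):
        self.fail_threshold = fail_threshold
        self.recover_threshold = recover_threshold
        self.healthy = {site: True for site in sites}
        # Count of consecutive probe results that contradict the current state.
        self.streak = {site: 0 for site in sites}

    def record(self, site, probe_ok):
        """Feed in one probe result and update the site's state if warranted."""
        if probe_ok == self.healthy[site]:
            self.streak[site] = 0  # result agrees with current state
            return
        self.streak[site] += 1
        needed = self.recover_threshold if probe_ok else self.fail_threshold
        if self.streak[site] >= needed:
            self.healthy[site] = probe_ok
            self.streak[site] = 0

    def healthy_sites(self):
        return [site for site, ok in self.healthy.items() if ok]


tracker = HealthTracker(["dc1", "dc2"])
for _ in range(3):
    tracker.record("dc1", probe_ok=False)
print(tracker.healthy_sites())  # dc1 is excluded after three failed probes
```

Higher thresholds reduce flapping but lengthen the time before traffic moves away from a failed site.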

✗ Cons

→ Adds complexity in DNS configuration and health monitoring.

→ Cached DNS answers stay in use until their TTL expires, so routing changes and failover can take time to reach all users.

→ Requires consistent data synchronization across data centers to avoid stale data.
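The routing delay mentioned above can be estimated with simple arithmetic. The sketch below assumes a simplified model: the failure is detected after a fixed number of consecutive failed probes, and a resolver that cached the old answer just before detection keeps it for one full TTL. It ignores probe timeouts and any resolvers or clients that hold answers longer than the TTL; the function name and the example values are hypothetical.

```python
def worst_case_failover_seconds(probe_interval, fail_threshold, dns_ttl):
    """Approximate upper bound on how long users may be sent to a failed site."""
    detection = probe_interval * fail_threshold  # time to declare the site down
    return detection + dns_ttl                   # plus one full cached TTL


# Probes every 10 s, 3 consecutive failures required, 60 s TTL.
print(worst_case_failover_seconds(10, 3, 60))  # 90
```

Lowering the TTL shortens this window at the cost of more DNS queries reaching the GSLB controller.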

Use when serving a global user base with latency-sensitive applications and when uptime must be maintained despite regional failures. As a rough guide, it tends to pay off at traffic volumes above tens of thousands of requests per second globally.

Avoid when your user base is localized to a single region or when traffic volume is low (under a few thousand requests per second), as the added complexity and cost outweigh benefits.

Real World Examples

Netflix

Uses GSLB to route streaming requests to the nearest edge server cluster, reducing buffering and improving video start times globally.

Amazon

Employs GSLB to direct shoppers to the closest regional data center, ensuring fast page loads and high availability during peak sales.

Google

Implements GSLB to balance search and cloud service requests across multiple global data centers, optimizing latency and fault tolerance.

Alternatives

Anycast Routing

Routes user requests to the nearest data center using network-level IP routing instead of DNS-based resolution.

Use when: ultra-low latency is critical and the network infrastructure supports Anycast, typically for CDN or DNS services.

Regional Load Balancing

Distributes traffic only within a single region or data center rather than globally.

Use when: your user base is regional or global distribution is not required.

Content Delivery Network (CDN)

Caches static content closer to users globally but does not balance dynamic application traffic across data centers.

Use when: you are primarily serving static assets and want to reduce bandwidth costs, rather than load balancing a full application.

Summary

Global server load balancing directs user requests to the best data center worldwide to reduce latency and improve availability.

It uses DNS-based routing considering proximity, health, and load to distribute traffic across regions.

GSLB is essential for global services needing fault tolerance and fast response times but adds complexity and requires data synchronization.