
Routing and load balancing in Microservices - Scalability & System Analysis

Scalability Analysis - Routing and load balancing
Growth Table: Routing and Load Balancing at Different Scales
| Users | Requests per Second (RPS) | Routing Complexity | Load Balancer Setup | Network Traffic |
| --- | --- | --- | --- | --- |
| 100 | ~50 | Simple routing, single load balancer | One load balancer instance | Low; easily handled by a single server |
| 10,000 | ~5,000 | Multiple microservices, routing rules grow | Multiple load balancers with health checks | Moderate; requires monitoring |
| 1,000,000 | ~500,000 | Complex routing, service discovery needed | Distributed load balancers, global traffic management | High; needs optimized network and CDN |
| 100,000,000 | ~50,000,000 | Highly dynamic routing, multi-region failover | Hierarchical load balancing, edge routing, global DNS | Very high; requires advanced network infrastructure |
First Bottleneck

At small scale, the load balancer's CPU and memory become the first bottleneck, because a single instance must accept every incoming request and route it correctly. As traffic grows, routing logic and service discovery lookups can slow down, adding latency to every request. At medium scale, network bandwidth and latency between the load balancers and the microservices become critical. At large scale, global routing and failover complexity become the bottleneck unless traffic is properly distributed across regions.
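To make the bottleneck concrete, here is a minimal round-robin routing sketch (hypothetical names; real load balancers such as NGINX or HAProxy implement this in optimized native code). The point is structural: every request funnels through one `route` call, so the machine running it is the first thing to saturate.

```python
from itertools import cycle

# Hypothetical backend pool; in production these addresses would come
# from configuration or a service registry, not a hard-coded list.
backends = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
pool = cycle(backends)

def route(request_id: str) -> str:
    """Pick the next backend in round-robin order.

    Every single request passes through this one function on one machine,
    which is why the load balancer's CPU and memory break first at scale.
    """
    return next(pool)

# Three consecutive requests rotate through all three backends.
print([route(f"req-{i}") for i in range(3)])
# → ['10.0.0.1:8080', '10.0.0.2:8080', '10.0.0.3:8080']
```

Round-robin is the simplest policy; least-connections or latency-aware policies add per-request bookkeeping, which consumes the single instance's capacity even faster.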

Scaling Solutions
  • Horizontal Scaling: Add more load balancer instances behind a DNS or anycast IP to distribute traffic.
  • Service Discovery: Use dynamic service registries to keep routing updated without manual config.
  • Caching: Cache routing decisions or DNS lookups to reduce latency.
  • Sharding: Partition traffic by user region or service type to reduce load per balancer.
  • CDN and Edge Routing: Offload static content and route users to nearest data center.
  • Global Load Balancing: Use DNS-based or geo-aware load balancing for multi-region failover.
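The service-discovery strategy above can be sketched as an in-memory registry with TTL-based health checks (a toy model; production systems use Consul, etcd, or Eureka, and the class and field names here are illustrative assumptions). Instances re-register periodically as a heartbeat; entries that have not been seen within the TTL are treated as unhealthy and skipped, so routing stays current without manual configuration.

```python
import time

class ServiceRegistry:
    """Minimal in-memory service registry with TTL-based health filtering."""

    def __init__(self, ttl: float = 30.0):
        self.ttl = ttl  # seconds an instance stays healthy without a heartbeat
        self._instances: dict[str, dict[str, float]] = {}

    def register(self, service: str, address: str) -> None:
        # Registering (or re-registering) records the heartbeat timestamp.
        self._instances.setdefault(service, {})[address] = time.monotonic()

    def healthy_instances(self, service: str) -> list[str]:
        # Only return instances whose last heartbeat is within the TTL.
        now = time.monotonic()
        return [addr
                for addr, last_seen in self._instances.get(service, {}).items()
                if now - last_seen <= self.ttl]

registry = ServiceRegistry(ttl=30.0)
registry.register("orders", "10.0.1.5:9000")
registry.register("orders", "10.0.1.6:9000")
print(registry.healthy_instances("orders"))
```

A load balancer would query `healthy_instances("orders")` before each routing decision (or cache the result briefly, per the caching point above) instead of relying on a static backend list.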
Back-of-Envelope Cost Analysis

Assuming 1 million users generate ~500,000 RPS:

  • Each load balancer can handle ~5,000 concurrent connections and ~10,000 RPS.
  • Number of load balancers needed: 500,000 / 10,000 = 50 instances minimum.
  • Network bandwidth: If average request size is 10 KB, total bandwidth = 500,000 * 10 KB = ~5 GB/s (~40 Gbps).
  • Storage needs for routing state itself are minimal, but log and metrics storage grows in proportion to traffic.
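The arithmetic above can be checked in a few lines (same assumptions as the bullets: ~10,000 RPS per load balancer, ~10 KB average request size):

```python
# Back-of-envelope capacity check for 1M users at ~500,000 RPS.
total_rps = 500_000
rps_per_lb = 10_000       # assumed capacity of one load balancer instance
avg_request_kb = 10       # assumed average request size

lb_instances = total_rps // rps_per_lb                    # minimum instance count
bandwidth_gb_s = total_rps * avg_request_kb / 1_000_000   # KB/s -> GB/s
bandwidth_gbps = bandwidth_gb_s * 8                       # bytes -> bits

print(lb_instances, bandwidth_gb_s, bandwidth_gbps)  # → 50 5.0 40.0
```

In an interview, showing this kind of quick calculation (and then adding headroom, e.g. 2x for failover and traffic spikes) is usually more convincing than quoting a final number.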
Interview Tip

Start by explaining the role of routing and load balancing in microservices. Discuss how traffic grows and what breaks first. Then, describe scaling strategies step-by-step: horizontal scaling, service discovery, caching, and global load balancing. Use real numbers to show understanding of capacity and bottlenecks.

Self Check

Your database handles 1000 QPS. Traffic grows 10x. What do you do first?

Answer: Since the database is the bottleneck at 1000 QPS, and traffic grows to 10,000 QPS, the first step is to add read replicas and implement caching to reduce direct database load before scaling application servers or load balancers.

Key Result
Routing and load balancing scale by adding more load balancer instances, using service discovery for dynamic routing, and distributing traffic globally to avoid bottlenecks in CPU, memory, and network bandwidth.