Routing and load balancing in Microservices - System Design Guide

Problem Statement
When all client requests go to a single server or service instance, that instance becomes a bottleneck causing slow responses and potential crashes. Without distributing requests properly, some servers may be overloaded while others remain idle, leading to inefficient resource use and poor user experience.
Solution
Routing and load balancing distribute incoming requests across multiple service instances or servers to spread the workload evenly. Routing directs each request according to rules such as user location or service version, while load balancing monitors instance load and forwards requests accordingly so that no single instance is overwhelmed.
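To make the split between the two responsibilities concrete, here is a minimal sketch. It assumes a hypothetical `Router` that first routes by service version and then balances with a least-loaded rule; `Instance`, `active_requests`, and the pool names are illustrative only, and least-loaded selection is just one of several possible balancing strategies.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    active_requests: int = 0  # in-flight requests: the "load" being monitored


@dataclass
class Router:
    # Hypothetical mapping: service version -> pool of instances serving it.
    pools: dict = field(default_factory=dict)

    def route(self, version):
        """Routing: choose the pool that matches a rule (here, the version)."""
        return self.pools[version]

    def balance(self, pool):
        """Load balancing: choose the least-loaded instance in the pool."""
        return min(pool, key=lambda inst: inst.active_requests)

    def dispatch(self, version):
        target = self.balance(self.route(version))
        target.active_requests += 1  # caller should decrement when the request ends
        return target


router = Router(pools={
    "v1": [Instance("v1-a", active_requests=2), Instance("v1-b")],
    "v2": [Instance("v2-a", active_requests=1)],
})
chosen = router.dispatch("v1")
print(chosen.name)  # v1-b
```

Routing answers "which group of instances may serve this request?", and balancing answers "which member of that group should take it right now?".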
Architecture
[Diagram: Clients -> Load Balancer -> Service A (multiple instances)]

This diagram shows clients sending requests to a load balancer, which routes requests to multiple service instances to balance the load.

Trade-offs
✓ Pros
Prevents any single server from becoming a bottleneck by distributing requests evenly.
Improves system availability by rerouting traffic if a server fails.
Enables scaling out by adding more service instances behind the load balancer.
Supports routing rules for directing traffic based on criteria like user location or version.
✗ Cons
Adds an extra network hop which can introduce slight latency.
Load balancer itself can become a single point of failure if not highly available.
Complexity increases with advanced routing rules and health checks.
Use when: your system runs multiple service instances and you either expect traffic above roughly 1,000 requests per second or need high availability and fault tolerance.
Avoid when: your system has only one instance, or traffic is very low (under roughly 100 requests per second) and the overhead of load balancing outweighs its benefits.
Real World Examples
Netflix
Uses load balancers to distribute streaming requests across edge servers, ensuring smooth playback and failover.
Uber
Routes ride requests to the nearest available driver service instance using geographic routing combined with load balancing.
Amazon
Balances API requests across multiple backend services to maintain low latency and high availability during peak shopping times.
Code Example
The before code calls a fixed service instance, risking overload and failure. The after code cycles through multiple instances, distributing requests evenly to balance load.
import requests

### Before: No load balancing, direct calls to fixed service instance
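class ServiceClient:
    def get_data(self):
        # Always calls the same service instance
        # (the 5-second timeout is an illustrative choice, so a hung server cannot block forever)
        response = requests.get('http://service-instance-1/api/data', timeout=5)
        response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
        return response.json()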


### After: Simple round-robin load balancer client
class LoadBalancerClient:
    def __init__(self):
        self.instances = [
            'http://service-instance-1/api/data',
            'http://service-instance-2/api/data',
            'http://service-instance-3/api/data'
        ]
        self.index = 0

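    def get_data(self):
        # Pick the current instance, then advance the index (wrapping at the end)
        url = self.instances[self.index]
        self.index = (self.index + 1) % len(self.instances)
        response = requests.get(url, timeout=5)  # 5-second timeout is an illustrative choice
        response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
        return response.json()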
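The "after" client above keeps a plain counter. If one client object is shared between threads, two threads may read the same index and pick the same instance twice. The sketch below shows one possible way to guard the selection step with a lock. `RoundRobinPicker` and `next_url` are hypothetical names, and the HTTP call is left out so the example runs without a network.

```python
import itertools
import threading


class RoundRobinPicker:
    """Cycles through instance URLs; safe to share between threads."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._cycle = itertools.cycle(list(instances))
        self._lock = threading.Lock()

    def next_url(self):
        # The lock keeps two threads from advancing the iterator at the same time.
        with self._lock:
            return next(self._cycle)


picker = RoundRobinPicker([
    "http://service-instance-1/api/data",
    "http://service-instance-2/api/data",
    "http://service-instance-3/api/data",
])
picks = [picker.next_url() for _ in range(4)]
print(picks[3] == picks[0])  # True: the fourth pick wraps around to the first instance
```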
Alternatives
Client-side load balancing
Clients decide which service instance to call, removing the need for a centralized load balancer.
Use when: Choose when you want to reduce infrastructure complexity and clients can maintain service instance lists.
DNS-based load balancing
Uses DNS to distribute requests by returning different IP addresses for the same domain.
Use when: Choose when you need simple load distribution without real-time health checks or fine-grained routing.
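As a rough illustration of the DNS-based approach, the sketch below collects every address a resolver returns for one hostname and lets the client pick among them. `resolve_addresses`, `pick_address`, and `fake_resolver` are hypothetical helpers; the fake resolver stands in for a DNS answer with three records so the example runs offline, and `socket.getaddrinfo` is used only as the default resolver.

```python
import random
import socket


def resolve_addresses(hostname, port, resolver=socket.getaddrinfo):
    """Return the distinct IP addresses the resolver reports for hostname."""
    results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({sockaddr[0] for *_, sockaddr in results})


def pick_address(hostname, port, resolver=socket.getaddrinfo, rng=random):
    """Pick one resolved address at random, as a simple client might."""
    return rng.choice(resolve_addresses(hostname, port, resolver))


def fake_resolver(hostname, port, proto=0):
    # Stand-in for a DNS answer with three A records (documentation-range IPs).
    ips = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
    return [(socket.AF_INET, socket.SOCK_STREAM, proto, "", (ip, port)) for ip in ips]


addresses = resolve_addresses("api.example.com", 443, resolver=fake_resolver)
print(addresses)  # ['192.0.2.10', '192.0.2.11', '192.0.2.12']
```

Nothing in this flow knows whether an address is healthy, which is why the approach suits cases that do not need real-time health checks.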
Summary
Routing and load balancing prevent server overload by distributing client requests across multiple service instances.
They improve system availability, scalability, and fault tolerance by directing traffic intelligently.
Choosing the right load balancing strategy depends on traffic volume, system complexity, and failure tolerance needs.

Practice

1. What is the main purpose of routing in a microservices architecture?
easy
A. To store data persistently across services
B. To monitor the health of microservices
C. To encrypt communication between services
D. To send requests to the correct microservice based on rules

Solution

  1. Step 1: Understand routing role

    Routing directs incoming requests to the right microservice based on predefined rules like URL paths or headers.
  2. Step 2: Differentiate routing from other functions

    Storing data, encrypting communication, and monitoring are separate concerns handled by databases, security layers, and monitoring tools respectively.
  3. Final Answer:

    To send requests to the correct microservice based on rules -> Option D
  4. Quick Check:

    Routing = directing requests [OK]
Hint: Routing directs requests to the right service [OK]
Common Mistakes:
  • Confusing routing with data storage
  • Mixing routing with security or monitoring
  • Thinking routing balances load
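The rule-based routing described in this solution can be sketched as a first-match rule table. The rule format, the service names, and the `X-Api-Version` header are assumptions made for illustration, and header names are matched case-sensitively here for simplicity.

```python
def route_request(path, headers, rules, default="not-found"):
    """Return the service name of the first rule that matches the request."""
    for rule in rules:
        path_ok = path.startswith(rule.get("path_prefix", ""))
        header = rule.get("header")  # optional (name, value) pair
        header_ok = header is None or headers.get(header[0]) == header[1]
        if path_ok and header_ok:
            return rule["service"]
    return default


# Hypothetical rules; more specific rules come first because the first match wins.
RULES = [
    {"path_prefix": "/orders", "header": ("X-Api-Version", "2"), "service": "orders-v2"},
    {"path_prefix": "/orders", "service": "orders-v1"},
    {"path_prefix": "/users", "service": "users"},
]

print(route_request("/orders/42", {"X-Api-Version": "2"}, RULES))  # orders-v2
print(route_request("/orders/42", {}, RULES))                      # orders-v1
print(route_request("/health", {}, RULES))                         # not-found
```

Note that the function only decides which service handles the request; it does not choose among instances of that service, which is the load balancer's job.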
2. Which of the following most closely resembles a common (YAML-style) configuration syntax for a load balancer rule that forwards requests evenly to multiple instances?
easy
A. round_robin: [instance1, instance2, instance3]
B. loadbalance = {instance1; instance2; instance3}
C. balance->instances(instance1, instance2, instance3)
D. forward: instance1 | instance2 | instance3

Solution

  1. Step 1: Identify common load balancing syntax

    Round robin is a standard load balancing method cycling through instances evenly, often expressed as a list.
  2. Step 2: Evaluate options for correct syntax style

    round_robin: [instance1, instance2, instance3] pairs the round_robin keyword with a plain list of instances, which resembles common YAML-style configuration. The other options use syntax that does not match any widely used configuration format.
  3. Final Answer:

    round_robin: [instance1, instance2, instance3] -> Option A
  4. Quick Check:

    Round robin uses list syntax [OK]
Hint: Look for standard list syntax with round robin keyword [OK]
Common Mistakes:
  • Using semicolons instead of commas
  • Incorrect assignment operators
  • Using arrows or pipes incorrectly
3. Given the following pseudo-code for a load balancer using weighted routing:
weights = {"serviceA": 3, "serviceB": 1}
requests = 8
for i in range(requests):
    target = weighted_choice(weights)
    print(target)
What is the expected number of requests routed to serviceA?
medium
A. 6
B. 8
C. 4
D. 2

Solution

  1. Step 1: Understand weighted routing concept

    Weights define the share of requests each service receives relative to the others. serviceA has weight 3 and serviceB has weight 1, so the total weight is 4.
  2. Step 2: Calculate expected requests for serviceA

    Out of 8 requests, serviceA should get (3/4)*8 = 6 requests on average.
  3. Final Answer:

    6 -> Option A
  4. Quick Check:

    Weighted share = 6 requests [OK]
Hint: Multiply total requests by service weight fraction [OK]
Common Mistakes:
  • Ignoring weights and dividing requests equally
  • Confusing total weight with individual weights
  • Calculating requests for serviceB instead
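The pseudo-code in this question leaves `weighted_choice` undefined. One possible implementation, assuming random selection proportional to weight via the standard library's `random.choices`, is sketched below together with the expected-value arithmetic from the solution. `expected_requests` is a hypothetical helper added for illustration.

```python
import random
from collections import Counter


def weighted_choice(weights, rng=random):
    """Pick one key with probability proportional to its weight."""
    services = list(weights)
    return rng.choices(services, weights=[weights[s] for s in services], k=1)[0]


def expected_requests(weights, service, total_requests):
    """Expected requests for a service: its weight share times the total."""
    return total_requests * weights[service] / sum(weights.values())


weights = {"serviceA": 3, "serviceB": 1}
print(expected_requests(weights, "serviceA", 8))  # 6.0

# A single run of 8 requests can deviate from 6; a large run approaches the 3:1 ratio.
rng = random.Random(0)
counts = Counter(weighted_choice(weights, rng) for _ in range(10_000))
share_a = counts["serviceA"] / 10_000
print(share_a)  # close to 0.75
```

The value 6 is an average over many runs. With random selection, any particular run of 8 requests may send more or fewer than 6 to serviceA.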
4. A load balancer is configured with the following rule:
if (instance.isHealthy()) {
  forwardRequest(instance)
} else {
  skipInstance(instance)
}
However, requests are still being sent to unhealthy instances. What is the most likely cause?
medium
A. Instances are overloaded but still marked healthy
B. Health check logic is not integrated with the load balancer
C. Load balancer is using round robin instead of weighted routing
D. Routing rules are missing URL path matching

Solution

  1. Step 1: Analyze health check integration

    The code shows a health check condition, but if the load balancer does not actually use this logic, unhealthy instances may still receive traffic.
  2. Step 2: Evaluate other options for relevance

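    The choice between round robin and weighted routing has no effect on health checks. An overloaded instance that is still marked healthy is, by definition, not an unhealthy instance, so it does not explain the symptom. URL path matching is unrelated to health status.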
  3. Final Answer:

    Health check logic is not integrated with the load balancer -> Option B
  4. Quick Check:

    Health check integration = key [OK]
Hint: Check if health logic is actually used by load balancer [OK]
Common Mistakes:
  • Assuming routing method affects health checks
  • Confusing overload with health status
  • Ignoring missing integration of health logic
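To show what integrating the health check means in practice, here is a minimal sketch in which the selection step itself consults health status, so an unhealthy instance can never be returned. `HealthAwareBalancer` and the `health` table are hypothetical; a real system would typically fill such a table from periodic probes.

```python
class HealthAwareBalancer:
    """Round robin that consults health status on every selection."""

    def __init__(self, instances, is_healthy):
        self.instances = list(instances)
        self.is_healthy = is_healthy  # callable: instance -> bool
        self.index = 0

    def next_instance(self):
        # Try each instance at most once, starting from the current position.
        for _ in range(len(self.instances)):
            candidate = self.instances[self.index]
            self.index = (self.index + 1) % len(self.instances)
            if self.is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy instances available")


# Hypothetical health table keyed by instance name.
health = {"instance-1": True, "instance-2": False, "instance-3": True}
balancer = HealthAwareBalancer(health, is_healthy=lambda name: health[name])

selected = [balancer.next_instance() for _ in range(4)]
print(selected)  # ['instance-1', 'instance-3', 'instance-1', 'instance-3']
```

If the health check lived in a separate code path that the selection logic never called, the balancer would behave like plain round robin and keep sending traffic to instance-2.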
5. You need to design a routing and load balancing system for a microservices app that handles both user requests and background jobs. User requests must be routed based on URL paths, and load balanced evenly. Background jobs should be routed to a separate set of instances with weighted load balancing. Which architecture best fits this requirement?
hard
A. Use DNS-based routing to split traffic, then apply round robin load balancing on all instances
B. Deploy two separate load balancers, one for user requests with weighted balancing, another for jobs with round robin
C. Use a single load balancer with path-based routing directing to two target groups; one uses round robin, the other weighted balancing
D. Route all traffic to a single instance that forwards requests internally based on type

Solution

  1. Step 1: Identify routing needs for user requests and jobs

    User requests require path-based routing to separate them from background jobs, which need different load balancing strategies.
  2. Step 2: Choose architecture supporting both routing and load balancing rules

    A single load balancer with path-based routing can direct traffic to two target groups. One group uses round robin for user requests, the other weighted for jobs, meeting all requirements efficiently.
  3. Final Answer:

    Use a single load balancer with path-based routing directing to two target groups; one uses round robin, the other weighted balancing -> Option C
  4. Quick Check:

    Path-based routing + a separate balancing strategy per target group = Option C [OK]
Hint: Combine path routing with separate load balancing per target group [OK]
Common Mistakes:
  • Using weighted balancing for user requests instead of round robin
  • Splitting with DNS which lacks path awareness
  • Routing all traffic to one instance causing bottlenecks
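The architecture in Option C can be sketched as one entry point whose path prefixes map to target groups, each with its own balancing strategy. The prefixes, instance names, and weights below are hypothetical, and `PathRoutingBalancer` is an illustrative class, not the API of any particular load balancer product.

```python
import itertools
import random


def round_robin(instances):
    """Return a picker that cycles through instances evenly."""
    cycle = itertools.cycle(instances)
    return lambda: next(cycle)


def weighted(weights, rng=random):
    """Return a picker that chooses instances in proportion to their weights."""
    names = list(weights)
    return lambda: rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


class PathRoutingBalancer:
    """Path prefix selects a target group; the group's picker selects an instance."""

    def __init__(self, routes):
        # Longest prefix first so a more specific route always wins.
        self.routes = sorted(routes.items(), key=lambda kv: len(kv[0]), reverse=True)

    def dispatch(self, path):
        for prefix, pick in self.routes:
            if path.startswith(prefix):
                return pick()
        raise LookupError(f"no route for {path}")


lb = PathRoutingBalancer({
    "/api": round_robin(["user-1", "user-2"]),
    "/jobs": weighted({"job-big": 3, "job-small": 1}, rng=random.Random(0)),
})

api_picks = [lb.dispatch("/api/profile") for _ in range(3)]
print(api_picks)  # ['user-1', 'user-2', 'user-1']
job_pick = lb.dispatch("/jobs/resize")  # job-big or job-small, roughly 3:1 over many calls
```

Keeping both groups behind one balancer means clients see a single address, while each workload still gets the distribution policy it needs.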