| Users / Traffic | Routing Complexity | Splitting Use Cases | Infrastructure Needs | Monitoring & Control |
|---|---|---|---|---|
| 100 users | Simple routing rules, mostly static | Rare, manual splitting for testing | Single load balancer, minimal proxies | Basic logs and alerts |
| 10,000 users | Dynamic routing based on service health | Canary releases, A/B testing starts | Multiple load balancers, API gateways | Real-time monitoring dashboards |
| 1,000,000 users | Advanced routing with weighted splits, geo-routing | Automated traffic splitting for experiments | Distributed proxies, service mesh adoption | Automated anomaly detection, tracing |
| 100,000,000 users | Global traffic management, multi-region routing | Complex multi-dimensional splits (device, region, version) | Global DNS, edge proxies, multi-cloud | AI-driven traffic control, self-healing |
Traffic management (routing, splitting) in Microservices - Scalability & System Analysis
At low to medium scale, the first bottleneck is usually the routing layer: API gateways and load balancers. As routing rules and traffic volume grow, these components can become overloaded, which shows up as higher latency or failed requests.
As traffic grows, service discovery and configuration management also become bottlenecks, since routing decisions depend on up-to-date service health and versions.
- Horizontal scaling: Add more instances of API gateways and proxies to distribute routing load.
- Service mesh: Offload routing and splitting logic to sidecars for decentralized control.
- Caching routing decisions: Reduce repeated lookups by caching routing rules locally.
- Weighted routing and traffic splitting: Use dynamic weights to gradually shift traffic during deployments.
- Global traffic management: Use DNS-based geo-routing and edge proxies for global scale.
- Automation: Automate routing updates and health checks to avoid stale routes.
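The weighted routing item in the list above can be illustrated with a minimal Python sketch. It is not tied to any particular gateway or mesh: the function names `pick_backend` and `shift_weights`, the version labels, and the 90/10 starting split are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def pick_backend(weights, rng=random):
    """Pick one backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def shift_weights(weights, old, new, step):
    """Return a copy of `weights` with up to `step` points moved from `old` to `new`."""
    moved = min(step, weights[old])  # never push the old version below zero
    updated = dict(weights)
    updated[old] -= moved
    updated[new] = updated.get(new, 0) + moved
    return updated


# Hypothetical starting split: 90% of requests to v1, 10% to v2.
weights = {"v1": 90, "v2": 10}

rng = random.Random(42)  # seeded so repeated runs give the same sample
counts = Counter(pick_backend(weights, rng) for _ in range(10_000))

# One rollout step: move 15 weight points from v1 to v2.
next_weights = shift_weights(weights, "v1", "v2", 15)
print(counts, next_weights)
```

Repeating `shift_weights` while watching error rates is one way to shift traffic gradually during a deployment. A real rollout would also need a rule for rolling the weights back.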
- At 1M users with 1 request per second each, expect ~1 million requests per second (QPS) at peak.
- Assuming each API gateway instance handles ~5,000 QPS, the routing layer needs ~200 instances (1,000,000 / 5,000).
- Service mesh sidecars add CPU and memory overhead per service instance.
- Bandwidth depends on request size; for 1 KB requests, 1M QPS = ~1 GB/s network traffic.
- Storage for routing configs and logs grows with number of rules and traffic volume.
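The arithmetic behind these estimates can be reproduced with a short Python sketch. The function name `estimate_routing_capacity` is hypothetical, and the sketch assumes 1 KB = 1,000 bytes and no spare headroom, so treat the results as rough lower bounds rather than a sizing recommendation.

```python
import math


def estimate_routing_capacity(users, req_per_user_per_sec, qps_per_gateway, request_bytes):
    """Back-of-the-envelope estimate of routing-layer load and size."""
    peak_qps = users * req_per_user_per_sec
    gateways = math.ceil(peak_qps / qps_per_gateway)  # round up to whole instances
    bandwidth_bytes_per_sec = peak_qps * request_bytes
    return {
        "peak_qps": peak_qps,
        "gateway_instances": gateways,
        "bandwidth_gb_per_sec": bandwidth_bytes_per_sec / 1e9,
    }


# Figures from the list above: 1M users, 1 request/s each,
# ~5,000 QPS per gateway instance, 1 KB (assumed 1,000 bytes) per request.
estimate = estimate_routing_capacity(1_000_000, 1, 5_000, 1_000)
print(estimate)
```

In practice the instance count would be padded for failover and traffic spikes. How much padding to add is a separate decision that this sketch does not model.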
Start by explaining the routing and splitting needs at different traffic scales. Identify the first bottleneck clearly (usually routing layer). Then discuss specific scaling techniques like horizontal scaling, service mesh, and automation. Use real numbers to justify your approach. Finally, mention monitoring and fallback strategies to maintain reliability.
Question: Your routing layer handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What is your first action and why?
Answer: Add more routing instances (horizontal scaling) and implement load balancing to distribute traffic. This prevents overload and maintains low latency.
Practice
Solution
Step 1: Understand traffic routing
Traffic routing means sending requests to the right service based on rules like URL path or user type.
Step 2: Identify the main purpose
Routing helps control where requests go, ensuring they reach the correct microservice.
Final Answer:
To direct incoming requests to specific services based on rules -> Option A
Quick Check:
Routing = directing requests [OK]
- Confusing routing with data storage
- Thinking routing encrypts data
- Mixing routing with monitoring
Solution
Step 1: Understand traffic splitting syntax
Traffic splitting uses weights to divide requests between service versions, e.g., 50% to v1 and 50% to v2.
Step 2: Identify correct syntax
The following definition correctly assigns weights to services for splitting:
split:
  - weight: 50
    service: v1
  - weight: 50
    service: v2
Other options mix routing and splitting or have invalid weight placement.
Final Answer:
split:
  - weight: 50
    service: v1
  - weight: 50
    service: v2
-> Option A
Quick Check:
Splitting uses weights per service [OK]
- Confusing routing rules with splitting rules
- Missing weights in splitting definitions
- Placing weights outside service entries
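The mistakes listed above (missing weights, weights placed outside the service entries) are the kind of problem a small validation step can catch before a config is applied. The Python sketch below assumes the split has already been parsed into a list of dictionaries with `weight` and `service` keys; the function name `validate_split` and the error wording are hypothetical, not part of any specific tool.

```python
def validate_split(split):
    """Return a list of problems found in a list of {'weight', 'service'} entries."""
    if not isinstance(split, list) or not split:
        return ["split must be a non-empty list of entries"]
    problems = []
    for i, entry in enumerate(split):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected a mapping")
            continue
        if "service" not in entry:
            problems.append(f"entry {i}: missing service")
        weight = entry.get("weight")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(weight, bool) or not isinstance(weight, (int, float)) or weight < 0:
            problems.append(f"entry {i}: weight must be a non-negative number")
    return problems


good = [{"weight": 50, "service": "v1"}, {"weight": 50, "service": "v2"}]
bad = [{"service": "v1"}, {"weight": 50, "service": "v2"}]  # first entry has no weight

print(validate_split(good))
print(validate_split(bad))
```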
split:
  - weight: 70
    service: v1
  - weight: 30
    service: v2
Solution
Step 1: Read the weights for each service
Service v1 has weight 70, and service v2 has weight 30.
Step 2: Calculate percentage for v2
Total weight = 70 + 30 = 100. So, v2 gets 30/100 = 30% of requests.
Final Answer:
30% -> Option D
Quick Check:
Weight 30 means 30% traffic [OK]
- Adding weights incorrectly
- Assuming equal split without weights
- Confusing service names
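The weight calculation from the solution above generalizes to any set of weights, including ones that do not sum to 100. A minimal Python sketch follows; the function name `traffic_shares` and the list-of-dictionaries representation of the split are assumptions made for illustration.

```python
def traffic_shares(split):
    """Convert per-service weights into percentages of total traffic."""
    total = sum(entry["weight"] for entry in split)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {entry["service"]: 100 * entry["weight"] / total for entry in split}


# The split from the question: weight 70 for v1, weight 30 for v2.
shares = traffic_shares([
    {"weight": 70, "service": "v1"},
    {"weight": 30, "service": "v2"},
])
print(shares)
```

With weights 3 and 1, the same function yields 75% and 25%. The share is always the weight divided by the total, not the raw weight value.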
route:
  path: /user
  service: user-service-v1
  weight: 100
But requests to /user/profile are not reaching user-service-v1. What is the likely problem?
Solution
Step 1: Analyze the path matching rule
The rule matches /user exactly. /user/profile is a subpath, so it does not match unless wildcard or prefix matching is used.
Step 2: Identify why requests fail
Since /user/profile does not exactly match /user, requests are not routed to user-service-v1.
Final Answer:
The path rule matches only exact /user, not subpaths like /user/profile -> Option C
Quick Check:
Exact path matching excludes subpaths [OK]
- Assuming weight must be split
- Blaming service name without checking
- Thinking routing ignores paths
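The difference between exact and prefix matching in the solution above can be shown with a small Python sketch. Real gateways and meshes each define their own matching semantics, so the two functions here (`matches_exact`, `matches_prefix`) are hypothetical and only show the general idea.

```python
def matches_exact(rule_path, request_path):
    """Exact matching: the request path must equal the rule path."""
    return request_path == rule_path


def matches_prefix(rule_path, request_path):
    """Prefix matching on segment boundaries: the rule path or any subpath of it."""
    base = rule_path.rstrip("/")
    return request_path == base or request_path.startswith(base + "/")


print(matches_exact("/user", "/user/profile"))   # exact rule misses the subpath
print(matches_prefix("/user", "/user/profile"))  # prefix rule covers it
print(matches_prefix("/user", "/users"))         # a different segment is not a subpath
```

Matching on segment boundaries (appending `/` before the prefix check) keeps `/users` from being routed by a `/user` rule, which a plain string-prefix test would allow.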
Solution
Step 1: Understand gradual rollout needs
Gradual rollout means controlling what percentage of users see the new version.
Step 2: Choose traffic management method
Traffic splitting with weights allows precise control of request percentages to each version.
Step 3: Evaluate other options
Routing by URL path cannot split traffic by percentage. Random load balancing lacks control. Deploying without control risks all users seeing the new version.
Final Answer:
Use traffic splitting with weights 90% to old and 10% to new service -> Option B
Quick Check:
Splitting controls rollout percentages [OK]
- Using URL path routing for percentage split
- Ignoring traffic control during rollout
- Relying on random load balancing
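A 90/10 split like the one in the answer above is sometimes implemented so that each user consistently sees the same version. The Python sketch below shows one way to do that by hashing a user identifier into 100 buckets. This is an assumption about how the split might be implemented, not something the exercise prescribes, and the names `rollout_bucket` and `choose_version` are hypothetical.

```python
import hashlib


def rollout_bucket(user_id):
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_version(user_id, new_percent):
    """Send users whose bucket is below `new_percent` to the new version."""
    return "new" if rollout_bucket(user_id) < new_percent else "old"


# Hypothetical user IDs; roughly 10% should land on the new version.
users = [f"user-{i}" for i in range(10_000)]
on_new = sum(choose_version(u, 10) == "new" for u in users)
print(on_new / len(users))
```

Because the bucket depends only on the user ID, raising `new_percent` from 10 to 50 keeps every user who was already on the new version there and only adds more users.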
