| Users | Requests per Second (RPS) | API Gateway Load | Microservices Load | Network Traffic | Notes |
|---|---|---|---|---|---|
| 100 users | ~50 RPS | Single instance handles easily | Microservices handle requests directly | Low | Simple setup, no scaling needed |
| 10,000 users | ~5,000 RPS | API Gateway needs horizontal scaling | Microservices start to see increased load | Moderate | Introduce load balancer, caching at gateway |
| 1,000,000 users | ~500,000 RPS | Multiple API Gateway instances behind LB | Microservices require scaling and partitioning | High | Use caching, rate limiting, and circuit breakers |
| 100,000,000 users | ~50,000,000 RPS | Global distributed API Gateways with CDN | Microservices sharded and geo-distributed | Very High | Advanced routing, edge caching, and autoscaling |
## API Gateway Pattern in Microservices: Scalability & System Analysis
The API Gateway becomes the first bottleneck as it handles all incoming requests. At moderate to high traffic (around 10,000 users or 5,000 RPS), a single gateway instance struggles with CPU and network limits. This causes increased latency and potential request drops.
- Horizontal Scaling: Add multiple API Gateway instances behind a load balancer to distribute traffic evenly.
- Caching: Implement response caching at the gateway to reduce calls to microservices.
- Rate Limiting: Protect backend services by limiting request rates per user or IP.
- Sharding Microservices: Partition microservices by user region or function to reduce load.
- CDN Integration: Use Content Delivery Networks for static content and edge caching to reduce gateway load.
- Circuit Breakers: Prevent cascading failures by stopping calls to failing microservices.
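Rate limiting at the gateway is commonly implemented as a per-client token bucket. Below is a minimal in-process sketch; the class and function names are illustrative, and a real gateway would back the bucket state with a shared store such as Redis rather than a local dict:

```python
import time


class TokenBucket:
    """Per-client token bucket: allows `rate` requests/sec with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond with HTTP 429 Too Many Requests


# One bucket per client key (e.g. user ID or IP), held by the gateway.
buckets: dict[str, TokenBucket] = {}

def check(client_id: str) -> bool:
    bucket = buckets.setdefault(client_id, TokenBucket(rate=5, capacity=10))
    return bucket.allow()
```

The burst capacity lets well-behaved clients send short spikes without being throttled, while the refill rate bounds sustained load on the backend services.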
- At 10,000 users (~5,000 RPS):
  - API Gateway CPU: ~50% target utilization per instance (assuming 2,000 RPS capacity per instance)
  - Network bandwidth: ~4 Gbps (5,000 RPS × 100 KB per request/response ≈ 500 MB/s)
  - Storage: minimal at the gateway; microservices storage depends on the data model
- At 1,000,000 users (~500,000 RPS):
  - API Gateways: ~250 instances at full rated capacity (500,000 RPS / 2,000 RPS per instance); provision more for headroom
  - Network bandwidth: ~400 Gbps (500,000 RPS × 100 KB ≈ 50 GB/s)
  - Microservices require database scaling (read replicas, sharding) and caching layers
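The estimates above can be reproduced with a short back-of-envelope calculation. The 2,000 RPS per instance and 100 KB per exchange figures are the same assumptions used in these notes, not measured values:

```python
import math


def capacity_estimate(rps: float, rps_per_instance: float = 2_000,
                      kb_per_exchange: float = 100) -> dict:
    """Back-of-envelope gateway sizing: instance count and network bandwidth."""
    instances = math.ceil(rps / rps_per_instance)   # at full rated capacity
    gbps = rps * kb_per_exchange * 8 / 1_000_000    # KB/s -> kilobits/s -> Gbps
    return {"instances": instances, "bandwidth_gbps": round(gbps, 1)}


print(capacity_estimate(5_000))    # 10,000-user tier
print(capacity_estimate(500_000))  # 1,000,000-user tier
```

Swapping in different per-instance throughput or payload sizes shows how sensitive the instance count and bandwidth budget are to those assumptions.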
Start by explaining the role of the API Gateway as a single entry point. Discuss how it simplifies client interactions but can become a bottleneck. Then, outline scaling steps: horizontal scaling, caching, rate limiting, and microservices partitioning. Always justify why each step is needed based on traffic growth.
Your database handles 1,000 QPS. Traffic grows 10× to 10,000 QPS. What do you do first?
Answer: Add read replicas and implement caching to reduce direct database load before scaling vertically or sharding.
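That answer can be sketched as a cache-aside read path: serve hot reads from a cache and send misses to a read replica. This is a minimal in-memory sketch; the `cache` dict and `read_from_replica` function are hypothetical stand-ins for Redis and a replica connection pool:

```python
import time

# In-memory stand-in for a shared cache such as Redis.
cache: dict[str, tuple[float, str]] = {}
CACHE_TTL_SECONDS = 60


def read_from_replica(key: str) -> str:
    """Hypothetical read against a replica; stands in for a real DB query."""
    return f"row-for-{key}"


def get_user(key: str) -> str:
    """Cache-aside read: serve from cache, fall back to a read replica on miss."""
    entry = cache.get(key)
    if entry is not None:
        expires_at, value = entry
        if time.monotonic() < expires_at:
            return value                    # cache hit: no database load
    value = read_from_replica(key)          # cache miss: one replica read
    cache[key] = (time.monotonic() + CACHE_TTL_SECONDS, value)
    return value
```

With a reasonable hit rate, the cache absorbs most of the 10× read growth before any vertical scaling or sharding is needed, which is why it comes first.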