
Why API gateways unify service access in Microservices - Scalability Evidence

Scalability Analysis - Why API gateways unify service access
Growth Table: User Scale and System Changes
Users | System Changes
100 users | Direct service calls; simple routing; minimal latency; no gateway needed
10,000 users | Multiple microservices; need unified access; increased request volume; API gateway introduced for routing and security
1,000,000 users | High concurrency; gateway handles load balancing, authentication, rate limiting; caching added; gateway scales horizontally
100,000,000 users | Global distribution; multiple API gateway clusters; CDN integration; advanced traffic shaping; microservices sharded; gateway handles failover and analytics
First Bottleneck: API Gateway Throughput and Latency

As user requests grow, the API gateway becomes the first bottleneck because it handles all incoming traffic to multiple microservices. It must route, authenticate, and apply policies for every request. Without scaling, the gateway's CPU, memory, or network bandwidth limits will cause increased latency and dropped requests.
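
To make the per-request work concrete, here is a minimal sketch of the authenticate-then-route path a gateway runs on every call. The `Gateway` class, the token set, and the `/users` and `/orders` prefixes are hypothetical names chosen for illustration. A real gateway would forward over the network rather than call a local function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple

# Hypothetical handler type: takes the request path, returns a response body.
Handler = Callable[[str], str]


@dataclass
class Gateway:
    """Toy gateway: one entry point that authenticates, then routes by path prefix."""
    valid_tokens: Set[str]
    routes: Dict[str, Handler] = field(default_factory=dict)

    def register(self, prefix: str, handler: Handler) -> None:
        self.routes[prefix] = handler

    def handle(self, path: str, token: Optional[str]) -> Tuple[int, str]:
        # Step 1: authenticate every request before touching any backend.
        if token not in self.valid_tokens:
            return 401, "unauthorized"
        # Step 2: route to the service with the longest matching prefix.
        for prefix in sorted(self.routes, key=len, reverse=True):
            if path.startswith(prefix):
                return 200, self.routes[prefix](path)
        return 404, "no route"


# Assumed example services; stand-ins for real backend calls.
gateway = Gateway(valid_tokens={"secret-token"})
gateway.register("/users", lambda path: f"user-service handled {path}")
gateway.register("/orders", lambda path: f"order-service handled {path}")
```

Because every request passes through `handle`, each added check (authentication, policy, routing) costs gateway CPU on every call, which is why this tier tends to saturate first.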

Scaling Solutions for API Gateway Bottleneck
  • Horizontal Scaling: Add more gateway instances behind a load balancer to distribute traffic.
  • Caching: Cache common responses at the gateway to reduce backend calls.
  • Rate Limiting: Protect backend services by limiting requests per user or IP.
  • Edge Deployment: Deploy gateways closer to users (regional clusters) to reduce latency.
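  • Offload SSL/TLS: Terminate encryption at the gateway tier (or at the load balancer in front of it) so backend services do not spend CPU on it. This shifts encryption work onto that tier, so size it accordingly.
  • Use a CDN: Serve static content from a CDN so those requests never reach the gateway.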
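
As an illustration of the rate-limiting item above, here is a minimal token-bucket sketch keyed by client ID. Token bucket is one common way to implement per-user limits, not the only one. The class names, the `rate` and `capacity` parameters, and the injectable `clock` are assumptions made for this example. The limiter holds its state in one process, so several gateway instances would need a shared store to enforce a global limit.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill based on elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client (user ID or IP); in-memory, single process."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str) -> bool:
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self.buckets[client_id] = bucket
        return bucket.allow()
```

A gateway would call `limiter.allow(client_id)` before forwarding and return HTTP 429 when it yields `False`.
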
Back-of-Envelope Cost Analysis
  • At 1M users, assuming 1 request per second each = 1M RPS total.
  • One gateway instance handles ~5,000 RPS → need ~200 instances.
  • Network bandwidth per gateway: a 1 Gbps link (~125 MB/s) carries at most ~12,500 requests of 10 KB each per second in theory, so roughly 10,000 once protocol overhead is included.
  • Storage at gateway is minimal (caching few GBs), but backend storage grows with data.
  • Cost scales with number of gateway instances, bandwidth, and caching infrastructure.
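
The arithmetic above can be reproduced with a short script. This is a sketch that uses only the figures stated in the list (1M users at 1 request per second, ~5,000 RPS per instance, 1 Gbps links, 10 KB requests). The function names are made up for the example, and the bandwidth figure ignores protocol overhead.

```python
import math


def gateways_needed(total_rps: float, rps_per_instance: float) -> int:
    """Instances required to serve total_rps, rounded up."""
    return math.ceil(total_rps / rps_per_instance)


def max_rps_for_bandwidth(link_gbps: float, request_kb: float) -> float:
    """Theoretical requests per second a link can carry (1 KB = 1,000 bytes)."""
    bytes_per_second = link_gbps * 1e9 / 8
    return bytes_per_second / (request_kb * 1000)


def link_utilization(rps: float, request_kb: float, link_gbps: float) -> float:
    """Fraction of the link used at a given request rate."""
    return rps * request_kb * 1000 * 8 / (link_gbps * 1e9)


users = 1_000_000
total_rps = users * 1                                  # 1 request per second per user
instances = gateways_needed(total_rps, 5_000)          # 200
bandwidth_rps = max_rps_for_bandwidth(1, 10)           # 12500.0
utilization = link_utilization(5_000, 10, 1)           # about 0.4

print(f"{instances} instances, {bandwidth_rps:.0f} RPS link ceiling, "
      f"{utilization:.0%} link use per instance")
```

Under these assumptions each instance at 5,000 RPS uses about 40% of its link, so the per-instance request-processing limit is reached before network bandwidth is.
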
Interview Tip: Structuring Scalability Discussion

Start by explaining the role of the API gateway in unifying access. Discuss how traffic growth impacts the gateway first. Identify bottlenecks like CPU, memory, and network. Propose scaling solutions like horizontal scaling and caching. Mention trade-offs and monitoring needs. Use clear examples and numbers to support your points.

Self Check Question

Your API gateway handles 1,000 requests per second. Traffic grows 10x to 10,000 RPS. What do you do first and why?

Key Result
API gateways unify access but become the first bottleneck as traffic grows; horizontal scaling and caching at the gateway are key to handling millions of users efficiently.