
Popular gateways (Kong, AWS API Gateway, Nginx) in Microservices - Scalability & System Analysis

Scalability Analysis - Popular gateways (Kong, AWS API Gateway, Nginx)
Growth Table: Popular API Gateways
| Users / Traffic | What Changes? | Gateway Load | Latency Impact | Security & Features |
|---|---|---|---|---|
| 100 users | Basic routing, simple auth | Single gateway instance handles traffic | Low latency, minimal overhead | Basic rate limiting, logging |
| 10,000 users | Increased requests, more auth checks | Multiple gateway instances behind a load balancer | Latency rises slightly due to processing | Advanced rate limiting, caching enabled |
| 1,000,000 users | High concurrency, complex routing rules | Horizontal scaling of gateways, caching layers added | Latency managed with caching and optimized configs | Security policies, JWT validation, throttling |
| 100,000,000 users | Massive traffic, global distribution | Multi-region gateway clusters, CDN integration | Latency minimized via edge caching and CDNs | WAF, DDoS protection, advanced analytics |
First Bottleneck

The first bottleneck is usually the API gateway server CPU and memory. As traffic grows, the gateway must process authentication, routing, rate limiting, and logging for every request. This processing can overwhelm a single instance, causing increased latency and dropped requests.
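To see why this per-request work adds up, here is a minimal sketch of a gateway's request pipeline. All names and the routing table are hypothetical; real gateways such as Kong or Nginx do this in optimized native code, but every stage (auth, routing, logging) still costs CPU on every single request.

```python
import time

def authenticate(request):
    # Placeholder auth check. A real gateway might verify a JWT
    # signature here, which is one of the more CPU-intensive stages.
    return request.get("token") == "valid-token"

def route(request):
    # Simple prefix-based routing table (illustrative only).
    routes = {"/users": "user-service", "/orders": "order-service"}
    for prefix, backend in routes.items():
        if request["path"].startswith(prefix):
            return backend
    return None

def handle(request, log):
    # Every request pays for auth + routing + logging, so gateway
    # CPU load grows linearly with RPS until the instance saturates.
    if not authenticate(request):
        return 401
    backend = route(request)
    if backend is None:
        return 404
    log.append((time.time(), request["path"], backend))
    return 200

log = []
print(handle({"path": "/users/42", "token": "valid-token"}, log))  # 200
```

Multiply this fixed per-request cost by millions of requests per second and a single instance's CPU becomes the limit long before the backends do.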

Scaling Solutions
  • Horizontal Scaling: Add more gateway instances behind a load balancer to distribute traffic.
  • Caching: Use response caching to reduce repeated processing for the same requests.
  • Rate Limiting: Protect backend services by limiting requests per user or IP.
  • Offload SSL/TLS: Terminate TLS at the load balancer or CDN to reduce gateway CPU load.
  • Use a CDN: Serve static content and cacheable API responses from the edge to reduce load on the gateways.
  • Sharding: Route traffic based on user segments or regions to different gateway clusters.
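Of the techniques above, rate limiting is the easiest to sketch. Below is a simple token-bucket limiter, the same idea behind Nginx's `limit_req` and Kong's rate-limiting plugins; the class and parameter names are our own, not any gateway's API.

```python
import time

class TokenBucket:
    """Token-bucket rate limiter (illustrative sketch)."""

    def __init__(self, rate, capacity):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)   # 5 req/s steady, burst of 10
results = [bucket.allow() for _ in range(12)]
print(results.count(True))
```

Run back-to-back, roughly the first 10 calls (the burst capacity) succeed and the rest are rejected until tokens refill. The design choice: capacity sets how bursty a client may be, while rate sets the sustained throughput you protect your backends to.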
Back-of-Envelope Cost Analysis
  • At 1 million users, assuming 1 request per second each, gateways handle ~1 million RPS.
  • A single gateway instance handles ~3,000 RPS; 1,000,000 / 3,000 ≈ 334, so plan for ~350 instances to leave headroom.
  • Storage for logs: 1M RPS * 100 bytes/log * 3600 seconds = ~360 GB/hour.
  • Bandwidth: 1M RPS * 1 KB/request = ~1 GB/s (~8 Gbps network).
  • Costs rise with instances, bandwidth, and storage for logs and metrics.
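The envelope math above can be reproduced in a few lines, which is also a handy sanity check in an interview (the inputs are the same assumptions stated in the bullets):

```python
import math

users = 1_000_000
rps_per_user = 1
total_rps = users * rps_per_user                    # 1,000,000 RPS

rps_per_instance = 3_000
instances = math.ceil(total_rps / rps_per_instance)  # 334; plan ~350 for headroom

log_bytes_per_request = 100
log_gb_per_hour = total_rps * log_bytes_per_request * 3600 / 1e9  # 360 GB/hour

bytes_per_request = 1_000                            # 1 KB payload
bandwidth_gbps = total_rps * bytes_per_request * 8 / 1e9          # 8 Gbps

print(instances, log_gb_per_hour, bandwidth_gbps)    # 334 360.0 8.0
```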
Interview Tip

Start by identifying the main components and their limits. Discuss how traffic growth affects each part, especially the gateway. Then propose clear scaling steps: horizontal scaling, caching, and offloading work. Always mention trade-offs like cost and complexity.

Self Check

Your API gateway handles 3000 requests per second. Traffic grows 10x to 30,000 RPS. What do you do first?

Answer: Add more gateway instances behind a load balancer to distribute the load horizontally. This prevents CPU and memory overload on a single instance and maintains low latency.
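The sizing behind that answer is one line of arithmetic. The 30% headroom factor is an assumption for illustration, not a universal rule:

```python
import math

current_rps = 30_000        # traffic after the 10x spike
rps_per_instance = 3_000    # capacity per gateway instance
headroom = 1.3              # assumed 30% headroom so no instance runs at its limit

needed = math.ceil(current_rps * headroom / rps_per_instance)
print(needed)  # 13 instances with headroom (10 at the bare minimum)
```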

Key Result
API gateways first hit CPU and memory limits as traffic grows; horizontal scaling and caching are the keys to handling millions of requests per second.