
API gateway concept in HLD - Scalability & System Analysis

Scalability Analysis - API gateway concept
Growth Table: API Gateway Scaling
| Users/Traffic | API Gateway Load | Latency Impact | Security & Routing | Infrastructure Changes |
| --- | --- | --- | --- | --- |
| 100 users | Low requests per second (RPS); a single instance handles it well | Minimal latency, simple routing | Basic authentication and rate limiting | Single server or cloud function |
| 10,000 users | Moderate RPS; a single instance may start to saturate | Latency increases slightly; routing needs optimization | Enhanced security policies, throttling | Load balancer added, possibly multiple instances |
| 1 million users | High RPS; a single instance is insufficient | Latency-sensitive; needs caching and optimized paths | Advanced security (OAuth, JWT), API versioning | Horizontal scaling, distributed gateway cluster |
| 100 million users | Very high RPS; requires global distribution | Latency-critical; needs edge caching and CDN integration | Multi-tenant security, dynamic routing, throttling | Global load balancers, multi-region clusters, CDN |
First Bottleneck

The API gateway server's CPU and memory become the first bottleneck as traffic grows. The gateway must handle every incoming request, performing routing, authentication, rate limiting, and sometimes payload transformation. At moderate to high traffic, a single gateway instance cannot keep up, causing increased latency and dropped requests.
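To make the per-request work concrete, here is a minimal sketch of the three responsibilities named above (authenticate, rate-limit, route). All names, routes, and limits are hypothetical, and a real gateway would proxy the request upstream rather than return the backend URL:

```python
import time
from collections import defaultdict

# Hypothetical configuration: route prefixes, valid API keys, and a
# fixed-window rate limit per key.
ROUTES = {"/orders": "http://orders.internal", "/users": "http://users.internal"}
API_KEYS = {"key-123"}
RATE_LIMIT = 100   # requests per window, per key
WINDOW = 1.0       # window length in seconds

_counters = defaultdict(lambda: [0.0, 0])  # api_key -> [window_start, count]

def handle(path: str, api_key: str) -> tuple[int, str]:
    """Per-request gateway work: authenticate, rate-limit, then route."""
    # 1. Authentication.
    if api_key not in API_KEYS:
        return 401, "unauthorized"
    # 2. Fixed-window rate limiting.
    start, count = _counters[api_key]
    now = time.monotonic()
    if now - start >= WINDOW:
        _counters[api_key] = [now, 1]
    elif count >= RATE_LIMIT:
        return 429, "rate limited"
    else:
        _counters[api_key][1] += 1
    # 3. Prefix-based routing.
    prefix = "/" + path.split("/")[1]
    backend = ROUTES.get(prefix)
    if backend is None:
        return 404, "no route"
    return 200, backend + path  # a real gateway would proxy upstream here
```

Every request pays for all three steps on the gateway's CPU, which is exactly why this tier saturates first as RPS grows.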

Scaling Solutions
  • Horizontal Scaling: Add multiple API gateway instances behind a load balancer to distribute traffic.
  • Caching: Use response caching at the gateway or integrate with CDN to reduce backend load.
  • Rate Limiting & Throttling: Protect backend services by limiting requests per user or IP.
  • Edge Deployment: Deploy gateways closer to users globally to reduce latency.
  • Service Mesh Integration: For internal microservices, use service mesh to offload routing and security.
  • API Versioning & Routing Optimization: Efficient routing rules reduce processing time.
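The first of these solutions, horizontal scaling, hinges on a load balancer spreading traffic across gateway instances. A minimal sketch of the simplest policy, round-robin, with hypothetical instance names:

```python
import itertools

class RoundRobinBalancer:
    """Cycle requests evenly across gateway instances (simplest policy)."""

    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def pick(self) -> str:
        # Each call returns the next instance in rotation.
        return next(self._cycle)

lb = RoundRobinBalancer(["gw-1", "gw-2", "gw-3"])
picks = [lb.pick() for _ in range(6)]
# picks cycles: gw-1, gw-2, gw-3, gw-1, gw-2, gw-3
```

Production load balancers typically add health checks and weighted or least-connections policies on top of this basic rotation.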
Back-of-Envelope Cost Analysis
  • At 1 million users, assuming 1 request per second per user, the API gateway handles ~1 million RPS.
  • One gateway instance handles roughly 3,000-5,000 concurrent requests, so ~200-330 instances are needed (1,000,000 / 5,000 = 200; 1,000,000 / 3,000 ≈ 333).
  • Network bandwidth: 1 Gbps ≈ 125 MB/s; estimate the average request size to compute total bandwidth.
  • Storage is minimal at gateway level, mostly logs and cache; scale storage for logs accordingly.
  • Cost grows with number of instances, bandwidth, and caching infrastructure.
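The estimates above can be checked with a few lines of arithmetic. The per-user request rate, per-instance capacity, and average request size are all assumed figures from the text, not measurements:

```python
import math

# Assumptions from the back-of-envelope analysis above.
users = 1_000_000
rps_per_user = 1                        # assumed sustained rate per user
total_rps = users * rps_per_user        # ~1,000,000 RPS

per_instance_capacity = (3_000, 5_000)  # assumed concurrent requests per instance
instances_needed = [math.ceil(total_rps / c) for c in per_instance_capacity]
# -> roughly 200-334 gateway instances depending on capacity

avg_request_kb = 2                      # assumed average request+response size
bandwidth_mb_s = total_rps * avg_request_kb / 1024  # total MB/s
gbps_links = bandwidth_mb_s / 125       # 1 Gbps ≈ 125 MB/s
```

With a 2 KB average payload this works out to roughly 2 GB/s, i.e. on the order of sixteen 1 Gbps links; the instance count and bandwidth both scale linearly with RPS, which is why cost grows with traffic.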
Interview Tip

Start by explaining the API gateway's role and responsibilities. Discuss how traffic growth impacts CPU, memory, and network. Identify the first bottleneck clearly. Then propose scaling solutions step by step: horizontal scaling, caching, edge deployment. Mention trade-offs and cost implications, and use real numbers to show understanding.

Self Check

Your API gateway handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Add more API gateway instances behind a load balancer to horizontally scale and distribute the increased load, preventing CPU/memory saturation and reducing latency.
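Using the per-instance capacity assumed earlier (3,000-5,000 concurrent requests), a quick check shows why horizontal scaling is the right first move at 10,000 QPS:

```python
import math

qps = 10_000                     # traffic after the 10x growth
per_instance = (3_000, 5_000)    # assumed capacity per gateway instance

# Ceiling division: instances needed at each capacity estimate.
needed = [math.ceil(qps / c) for c in per_instance]
# A single instance saturates; 2-4 instances behind a load balancer suffice.
```

This also shows the scenario is modest: a handful of instances absorbs the 10x jump, so caching and edge deployment can wait until later growth stages.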

Key Result
API gateway first breaks at CPU/memory under high request load; horizontal scaling and caching are key to handle millions of users efficiently.