
Authentication at gateway level in Microservices - Scalability & System Analysis

Scalability Analysis - Authentication at gateway level
Growth Table: Authentication at Gateway Level
| Users | Requests per Second (RPS) | Gateway Load | Auth Service Load | Latency Impact | Scaling Needs |
|---|---|---|---|---|---|
| 100 | ~10 RPS | Low; single gateway instance | Low; single auth instance | Negligible | Basic setup, no scaling needed |
| 10,000 | ~1,000 RPS | Moderate; gateway CPU & memory increase | Moderate; auth service CPU & DB queries increase | Small latency increase possible | Start load balancing gateways, caching tokens |
| 1,000,000 | ~100,000 RPS | High; multiple gateway instances behind LB | High; auth DB and token validation bottleneck | Noticeable latency if no caching | Horizontal scaling, token caching, DB replicas |
| 100,000,000 | ~10,000,000 RPS | Very high; globally distributed gateways | Very high; sharded auth DB, distributed cache | Latency critical; must optimize | Global load balancing, CDN for static tokens, sharding, microservice partitioning |
First Bottleneck

The authentication database and token validation service become the first bottleneck as user requests grow. This is because every request at the gateway requires token verification, which involves database lookups or cryptographic operations. The gateway itself can scale horizontally, but the auth service and its database can get overwhelmed by high QPS.
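To make the per-request cost concrete, here is a minimal sketch of the gateway flow, assuming a session-token design where every request triggers a lookup. `auth_db_lookup` and `VALID_SESSIONS` are hypothetical stand-ins for the auth service and its database, not a real API:

```python
# Hypothetical stand-in for the auth service's session store (the "auth DB").
VALID_SESSIONS = {"token-abc": "user-1"}
db_lookups = 0  # counts how often the "database" is hit

def auth_db_lookup(token):
    """Simulated auth-DB query; in production this is a network/DB call."""
    global db_lookups
    db_lookups += 1
    return VALID_SESSIONS.get(token)

def handle_request(token, path):
    """Gateway middleware: verify the token, then forward to a backend."""
    user = auth_db_lookup(token)  # every single request pays this cost
    if user is None:
        return 401, "unauthorized"
    return 200, f"{user} -> {path}"  # forward to the backend service

status, _ = handle_request("token-abc", "/orders")
print(status, db_lookups)  # 200 1 -- one DB lookup per request
```

The lookup counter makes the bottleneck visible: at 100,000 RPS, the auth store absorbs 100,000 lookups per second unless something (caching or stateless tokens) removes it from the hot path.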

Scaling Solutions
  • Horizontal Scaling: Add more gateway instances behind a load balancer to handle more concurrent connections.
  • Token Caching: Cache validated tokens in a fast in-memory store (e.g., Redis) to reduce DB hits.
  • Read Replicas: Use read replicas for the authentication database to spread read load.
  • Stateless Tokens: Use JWT or similar tokens that can be validated without DB calls.
  • Sharding: Partition the authentication data by user segments to reduce DB contention.
  • Global Distribution: Deploy gateways and caches close to users to reduce latency.
  • Rate Limiting: Protect the auth service from overload by limiting requests per user or IP.
Back-of-Envelope Cost Analysis

Assuming 1 million users generate 100,000 RPS:

  • Each gateway server handles ~5,000 RPS → Need ~20 gateway servers.
  • Auth DB handles ~10,000 QPS max → Need at least 10 read replicas or use stateless tokens.
  • Token cache (Redis) handles ~100,000 ops/sec → A single Redis cluster can suffice.
  • Network bandwidth: 100,000 RPS x 1 KB/request ≈ 100 MB/s (~800 Mbps), requires 1 Gbps network links.
  • Storage: Auth DB stores user credentials and tokens, grows with user base; sharding helps manage size.
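The estimates above can be reproduced in a few lines. The per-server capacities (5,000 RPS per gateway, 10,000 QPS per DB replica, 1 KB per request) are the assumptions stated in the list, not measured figures:

```python
import math

total_rps = 100_000          # 1 million users (assumed load from the text)
gateway_capacity = 5_000     # RPS one gateway server handles (assumption)
db_capacity = 10_000         # QPS one auth-DB replica handles (assumption)

gateways = math.ceil(total_rps / gateway_capacity)
replicas = math.ceil(total_rps / db_capacity)
bandwidth_mbps = total_rps * 1 * 8 / 1000  # 1 KB/request -> megabits/sec

print(gateways, replicas, bandwidth_mbps)  # 20 10 800.0
```

The 800 Mbps result is why the text calls for 1 Gbps links: it leaves roughly 20% headroom before the network itself becomes the bottleneck.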
Interview Tip

Start by explaining the authentication flow at the gateway. Identify the bottleneck (auth DB and token validation). Discuss scaling the gateway horizontally first, then caching tokens to reduce DB load. Mention stateless tokens to avoid DB calls. Finally, talk about global distribution and rate limiting to handle very large scale.

Self Check

Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Implement token caching or switch to stateless tokens (like JWT) to reduce DB queries. Also, add read replicas to distribute DB read load. This reduces pressure on the database before scaling hardware.
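The stateless-token option from the answer can be sketched with an HMAC signature, which is the core idea behind JWT validation: the gateway checks the signature locally with a shared secret, so no database query is needed. The secret and token layout below are illustrative, not a specific JWT library:

```python
import hashlib
import hmac

SECRET = b"gateway-shared-secret"  # illustrative; load from config in practice

def issue_token(user_id):
    """Auth service signs the user id; anyone with SECRET can verify it."""
    sig = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{sig}"

def verify_token(token):
    """Gateway-side check: recompute the signature, no DB call required."""
    user_id, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return user_id if hmac.compare_digest(sig, expected) else None

t = issue_token("user-42")
print(verify_token(t))        # user-42
print(verify_token(t + "x"))  # None -- tampered signature rejected
```

Because verification is pure CPU work, it scales with gateway instances rather than with the auth database; the usual caveat is that stateless tokens cannot be revoked individually without reintroducing some shared state (a revocation list or short expiry).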

Key Result
Authentication at gateway level scales well initially by adding gateway instances, but the authentication database and token validation become bottlenecks at high traffic. Caching tokens and using stateless tokens are key to scaling beyond millions of users.