
Design a rate limiter in HLD - Scalability & System Analysis

Scalability Analysis - Design a rate limiter
Growth Table: Rate Limiter Scaling
| Users | Requests per Second (RPS) | Storage Needs | Latency Impact | System Changes |
|---|---|---|---|---|
| 100 | ~1,000 | Minimal (in-memory counters) | Low | Single server with in-memory rate limiting |
| 10,000 | ~100,000 | Moderate (distributed cache) | Low to moderate | Distributed cache (e.g., Redis cluster), load balancer |
| 1,000,000 | ~10,000,000 | High (sharded cache and DB) | Moderate | Sharded caches, multiple rate limiter instances, consistent hashing |
| 100,000,000 | ~1,000,000,000 | Very high (multi-region, sharded storage) | Higher latency possible | Global distributed rate limiting, hierarchical limits, CDN edge enforcement |
First Bottleneck

The first bottleneck is storing and updating the counters that track user requests. At low scale, in-memory counters on a single server work well. As users and requests grow, the rate limiter's data store (such as Redis or a database) becomes the bottleneck, because it must absorb a very high rate of counter reads and writes per second while keeping latency low.
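
To make that read/write pattern concrete, here is a minimal sketch of a single-server fixed-window limiter with in-memory counters; the class name, the 100-requests-per-minute limit, and the client id are illustrative assumptions, not part of the original design.

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Single-server rate limiter: one counter per (client, time window)."""

    def __init__(self, limit: int = 100, window_seconds: int = 60):
        self.limit = limit
        self.window = window_seconds
        self.counters = defaultdict(int)  # (client_id, window_id) -> request count

    def allow(self, client_id: str) -> bool:
        window_id = int(time.time()) // self.window   # which window we are in
        key = (client_id, window_id)
        self.counters[key] += 1                       # one read-modify-write per request
        # Note: old windows are never evicted here; a real limiter would clean them up.
        return self.counters[key] <= self.limit

limiter = FixedWindowLimiter(limit=100, window_seconds=60)
print(limiter.allow("user-42"))  # True until the 101st request in the same minute
```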

Scaling Solutions
  • Horizontal scaling: Add more rate limiter instances behind a load balancer to distribute traffic.
  • Distributed caching: Use Redis clusters or similar to store counters with fast access.
  • Sharding: Partition counters by user ID or API key to spread load across multiple nodes (see the sketch after this list).
  • Token bucket or leaky bucket algorithms: Track only a token count and a last-refill timestamp per client, reducing storage and computation per request.
  • Local caching with periodic sync: Cache counters locally and sync with central store to reduce writes.
  • CDN edge enforcement: For global scale, enforce limits closer to users to reduce central load.
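
One possible realization of the distributed-cache and sharding points above is a fixed-window counter kept in Redis, sketched below with the redis-py client. The shard hostnames, key scheme, and limits are assumptions for illustration; a production setup would typically use consistent hashing and an atomic Lua script or similar instead of a plain pipeline.

```python
import time
import zlib
import redis  # assumes the redis-py client is installed

# A small pool of Redis shards; counters are partitioned by client key.
SHARDS = [
    redis.Redis(host="redis-shard-0", port=6379),
    redis.Redis(host="redis-shard-1", port=6379),
]

LIMIT = 100          # requests allowed per window (assumed)
WINDOW_SECONDS = 60  # fixed window length (assumed)

def shard_for(client_id: str) -> redis.Redis:
    """Pick a shard with a stable hash of the client id."""
    return SHARDS[zlib.crc32(client_id.encode()) % len(SHARDS)]

def allow(client_id: str) -> bool:
    window_id = int(time.time()) // WINDOW_SECONDS
    key = f"rl:{client_id}:{window_id}"
    pipe = shard_for(client_id).pipeline()
    pipe.incr(key)                    # the one write per request
    pipe.expire(key, WINDOW_SECONDS)  # let stale windows age out automatically
    count, _ = pipe.execute()
    return count <= LIMIT
```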
Back-of-Envelope Cost Analysis
  • At 10,000 users with 10 RPS each = 100,000 RPS total.
  • Each request requires 1 read + 1 write to counter -> 200,000 ops/sec on data store.
  • Redis single instance handles ~100,000 ops/sec -> need 2+ Redis nodes or cluster.
  • Storage: counters are small (a few bytes each), but millions of users require efficient memory use.
  • Network bandwidth: assuming 1 KB per request metadata, 100,000 RPS = ~100 MB/s bandwidth.
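
The arithmetic above can be checked with a few lines of Python; all figures are the assumed values from the list.

```python
users = 10_000
rps_per_user = 10
total_rps = users * rps_per_user                  # 100,000 requests/sec

ops_per_request = 2                               # 1 read + 1 write on the counter
store_ops = total_rps * ops_per_request          # 200,000 ops/sec on the data store

redis_capacity = 100_000                          # assumed ops/sec per Redis node
nodes_needed = -(-store_ops // redis_capacity)    # ceiling division -> 2 nodes

bytes_per_request = 1_000                         # ~1 KB of metadata per request
bandwidth_mb_s = total_rps * bytes_per_request / 1_000_000  # ~100 MB/s

print(total_rps, store_ops, nodes_needed, bandwidth_mb_s)
```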
Interview Tip

Start by explaining the rate limiter's purpose and basic algorithm (fixed window, sliding window, token bucket). Then discuss expected traffic and identify bottlenecks. Propose scaling solutions step-by-step, focusing on data store limits and latency. Mention trade-offs like accuracy vs. performance. Finally, consider global scale and edge enforcement.
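
If the discussion turns to the token bucket, a minimal single-process sketch looks like this; the rate and capacity values are illustrative assumptions.

```python
import time

class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=10, capacity=20)  # ~10 req/s sustained, bursts of up to 20
```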

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Add read replicas and introduce caching to reduce database load. Also consider sharding the counters or moving them to a faster in-memory store such as Redis to absorb the higher QPS.
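
As a rough illustration of the caching step, a cache-aside read path might look like the sketch below; `db_query` is a hypothetical stand-in for the real database call and the TTL is an assumption.

```python
import json
import redis  # assumes the redis-py client is installed

cache = redis.Redis(host="localhost", port=6379)
CACHE_TTL = 30  # seconds; trades a little freshness for far fewer database reads

def db_query(user_id: str) -> dict:
    """Stand-in for a real database read (e.g., served from a read replica)."""
    return {"id": user_id, "name": "example"}

def get_user_profile(user_id: str) -> dict:
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                     # cache hit: no database read
    profile = db_query(user_id)                       # cache miss: go to the database
    cache.set(key, json.dumps(profile), ex=CACHE_TTL) # populate cache for later reads
    return profile
```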

Key Result
The main bottleneck in a rate limiter is the fast, frequent updating of counters in the data store. Scaling requires distributing counters across multiple nodes, caching aggressively, and using efficient algorithms so that latency stays low under high request rates.