
Rate limiting algorithms (token bucket, leaky bucket) in HLD - Architecture Diagram

System Overview - Rate limiting algorithms (token bucket, leaky bucket)

This system controls how many requests a client may make to a service in a given time window. It supports two common algorithms: the token bucket, which permits short bursts up to a fixed capacity, and the leaky bucket, which smooths traffic to a constant outflow rate. The goal is to prevent overload and ensure fair use by limiting request rates predictably.
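The token bucket half of the design can be illustrated with a minimal in-process sketch (the class name and parameters are illustrative; a production limiter would keep this state in the shared cache shown in the diagram):

```python
import time

class TokenBucket:
    """Token bucket: tokens refill at a fixed rate and each request spends one.
    Bursts of up to `capacity` requests are allowed when the bucket is full."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity          # max tokens the bucket can hold
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = capacity            # start full
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=5, refill_rate=1)   # 5-request burst, 1 req/s sustained
results = [bucket.allow() for _ in range(7)]
# The first 5 calls drain the full bucket; later calls fail until tokens refill
```

Note the burst behavior: a full bucket admits `capacity` back-to-back requests, then throttles to the sustained refill rate.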

Architecture Diagram
User
  |
  v
Load Balancer
  |
  v
API Gateway
  |
  +-------------------+
  | Rate Limiter      |
  |  - Token Bucket   |
  |  - Leaky Bucket   |
  +-------------------+
  |
  v
Service
  |
  v
Database

Cache (for tokens/state) <-> Rate Limiter
Components
- User (client): Sends requests to the system
- Load Balancer (load_balancer): Distributes incoming requests evenly across API Gateway instances
- API Gateway (api_gateway): Entry point for requests; forwards them to the Rate Limiter
- Rate Limiter (service): Applies the token bucket or leaky bucket algorithm to limit the request rate
- Cache (cache): Stores token/leaky bucket state for fast access
- Service (service): Processes allowed requests
- Database (database): Stores persistent data for the service
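The Rate Limiter's second algorithm, the leaky bucket, can be sketched the same way (this is the "bucket as meter" variant; names and parameters are illustrative):

```python
import time

class LeakyBucket:
    """Leaky bucket (as a meter): the bucket drains at a constant leak rate,
    and a request is admitted only if adding it would not overflow the bucket."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity      # max pending "water" (queued requests)
        self.leak_rate = leak_rate    # requests drained per second
        self.level = 0.0
        self.last_leak = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain the bucket for the elapsed time, never below empty
        self.level = max(0.0, self.level - (now - self.last_leak) * self.leak_rate)
        self.last_leak = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False

bucket = LeakyBucket(capacity=3, leak_rate=1)
results = [bucket.allow() for _ in range(5)]
# The first 3 fill the bucket; the rest are rejected until it drains
```

Unlike the token bucket, the leaky bucket enforces a steady outflow: traffic leaves at exactly `leak_rate`, which is why it is the smoother of the two algorithms.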
Request Flow - 10 Hops
1. User -> Load Balancer
2. Load Balancer -> API Gateway
3. API Gateway -> Rate Limiter
4. Rate Limiter -> Cache
5. Rate Limiter -> API Gateway
6. API Gateway -> Service
7. Service -> Database
8. Service -> API Gateway
9. API Gateway -> Load Balancer
10. Load Balancer -> User
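At hops 3-5, the API Gateway consults the Rate Limiter and either forwards the request onward (hops 6-8) or rejects it immediately. A minimal sketch of that decision point, with hypothetical `limiter` and `forward` interfaces:

```python
def handle_request(user_id, limiter, forward):
    """Gateway-side check: consult the rate limiter before forwarding.
    `limiter.allow(user_id)` and `forward(user_id)` are assumed interfaces."""
    if limiter.allow(user_id):
        return forward(user_id)           # request proceeds to Service/Database
    return (429, "Too Many Requests")     # rejected at the gateway, backend untouched

# Example with a stub limiter that admits only the first two requests
class StubLimiter:
    def __init__(self):
        self.calls = 0
    def allow(self, user_id):
        self.calls += 1
        return self.calls <= 2

limiter = StubLimiter()
responses = [handle_request("u1", limiter, lambda uid: (200, "ok")) for _ in range(3)]
```

Rejecting at the gateway keeps excess load off the Service and Database entirely, which is the point of placing the limiter this early in the flow.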
Failure Scenario
Component Fails: Cache
Impact: The Rate Limiter cannot quickly read or update token/leaky bucket state, causing slower rate-limit checks or temporarily incorrect limits.
Mitigation: Fall back to the database for state storage at higher latency; add retries and circuit breakers to avoid overloading the fallback path.
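The fallback-with-circuit-breaker mitigation can be sketched as follows (all names here are hypothetical; `cache_get` and `db_get` stand in for real cache and database clients):

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after `threshold` consecutive failures,
    calls are short-circuited for `cooldown` seconds."""

    def __init__(self, threshold=3, cooldown=30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def available(self):
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None   # half-open: let one attempt through
            self.failures = 0
            return True
        return False

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()

    def record_success(self):
        self.failures = 0

def get_bucket_state(user_id, cache_get, db_get, breaker):
    """Try the cache first; if it errors or the breaker is open,
    fall back to the slower database read."""
    if breaker.available():
        try:
            state = cache_get(user_id)
            breaker.record_success()
            return state
        except ConnectionError:
            breaker.record_failure()
    return db_get(user_id)   # higher latency, but limits stay roughly correct
```

Once the breaker opens, the cache is skipped entirely for the cooldown period, so a dead cache stops adding a failed-connection delay to every rate-limit check.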
Architecture Quiz
Test your understanding
Which component is responsible for enforcing the rate limit algorithms?
A. API Gateway
B. Load Balancer
C. Rate Limiter
D. Cache
Design Principle
This architecture uses a dedicated Rate Limiter service with fast cache access to efficiently enforce rate limits using token bucket and leaky bucket algorithms. The separation of concerns and fallback mechanisms ensure scalability and reliability under load.