Design: Rate Limiting System
Design the core rate limiting algorithms and their integration in a distributed API gateway or service. Out of scope: detailed user authentication, billing, or analytics.
Functional Requirements
FR1: Limit the number of requests a user or client can make in a given time window
FR2: Allow short bursts above the sustained rate instead of rejecting burst traffic outright
FR3: Prevent system overload by smoothing request rates
FR4: Provide configurable limits per user or API key
FR5: Handle millions of requests per second across distributed servers
FR6: Ensure fairness and prevent abuse
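
One way to read FR1 through FR4 together is as a token bucket: a sustained refill rate bounds long-run throughput, while the bucket capacity bounds the largest burst. The following is a minimal single-process sketch under that assumption, not a definitive design. The names `Limit`, `TokenBucketLimiter`, `allow`, and the `overrides` map are hypothetical and introduced only for illustration; the injectable `clock` exists so the behavior can be checked deterministically.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Limit:
    """Hypothetical per-key configuration (FR4)."""
    rate: float      # tokens refilled per second (sustained request rate)
    capacity: float  # bucket size (largest burst allowed, FR2)


@dataclass
class _Bucket:
    tokens: float    # tokens currently available
    updated: float   # clock reading at the last refill


class TokenBucketLimiter:
    """In-memory token bucket keyed by user or API key (illustrative only)."""

    def __init__(
        self,
        default: Limit,
        overrides: Optional[Dict[str, Limit]] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._default = default
        self._overrides = dict(overrides or {})
        self._clock = clock
        self._buckets: Dict[str, _Bucket] = {}

    def allow(self, key: str, cost: float = 1.0) -> bool:
        """Return True if the request may proceed, consuming `cost` tokens."""
        limit = self._overrides.get(key, self._default)
        now = self._clock()
        bucket = self._buckets.get(key)
        if bucket is None:
            # A new client starts with a full bucket, so an initial burst is allowed.
            bucket = _Bucket(tokens=limit.capacity, updated=now)
            self._buckets[key] = bucket
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - bucket.updated)
        bucket.tokens = min(limit.capacity, bucket.tokens + elapsed * limit.rate)
        bucket.updated = now
        if bucket.tokens >= cost:
            bucket.tokens -= cost
            return True
        return False
```

With `Limit(rate=1.0, capacity=3.0)`, a client can send three requests back to back and then roughly one per second afterwards. This sketch keeps state in a local dictionary, so it does not address FR5 on its own; a distributed deployment would need shared or partitioned state.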
Non-Functional Requirements
NFR1: Rate limiting must add minimal latency to request processing (added latency at p99 < 10 ms)
NFR2: System must be highly available (99.9% uptime)
NFR3: Rate limits must be consistent across distributed instances
NFR4: Support horizontal scaling to handle increasing traffic
NFR5: Memory usage per user/client must be small and bounded
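
To make NFR5 concrete, one commonly used option is a sliding-window counter, which approximates a rolling window with only two counters per client instead of a timestamp per request. The sketch below assumes that technique; the class name `SlidingWindowCounter`, its parameters, and the in-memory dictionary are hypothetical. To meet NFR3, the same state would have to live in a shared store with atomic updates, which this sketch does not attempt.

```python
import time
from typing import Callable, Dict, Tuple


class SlidingWindowCounter:
    """Approximate rolling-window limiter with constant state per client."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        # key -> (current window index, count in current window, count in previous window)
        self._state: Dict[str, Tuple[int, int, int]] = {}

    def allow(self, key: str) -> bool:
        now = self._clock()
        index = int(now // self._window)
        cur_index, cur, prev = self._state.get(key, (index, 0, 0))
        if index > cur_index:
            # Roll forward: keep the last window only if it is directly adjacent.
            prev = cur if index == cur_index + 1 else 0
            cur = 0
            cur_index = index
        # Weight the previous window by the share of it still inside the rolling window.
        fraction = (now % self._window) / self._window
        estimated = prev * (1.0 - fraction) + cur
        allowed = estimated < self._limit
        if allowed:
            cur += 1
        self._state[key] = (cur_index, cur, prev)
        return allowed
```

The estimate assumes requests in the previous window were evenly spread, so it trades some accuracy for a fixed, small footprint per client.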