How to Design an API Rate Limiter: A Simple and Scalable Approach
To design an API rate limiter, use tokens or counters to track requests per user or IP within a time window, and reject requests that exceed the limit. Implement an algorithm such as fixed window, sliding window, or token bucket, depending on the fairness and scalability you need.
Syntax
An API rate limiter typically follows this pattern:
- identifier: Unique key for user or client (e.g., user ID, IP address).
- limit: Maximum allowed requests in a time window.
- window: Time duration for counting requests (e.g., 1 minute).
- counter: Tracks number of requests made in the current window.
- algorithm: Logic to decide if a request is allowed or rejected.
```javascript
class RateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;        // max requests
    this.windowMs = windowMs;  // time window in ms
    this.counters = new Map(); // stores request counts
  }

  isAllowed(identifier) {
    const now = Date.now();
    if (!this.counters.has(identifier)) {
      this.counters.set(identifier, { count: 1, startTime: now });
      return true;
    }
    const data = this.counters.get(identifier);
    if (now - data.startTime >= this.windowMs) {
      // Reset window
      this.counters.set(identifier, { count: 1, startTime: now });
      return true;
    }
    if (data.count < this.limit) {
      data.count += 1;
      return true;
    }
    return false; // limit exceeded
  }
}
```
Example
This example shows a simple fixed window rate limiter that allows 3 requests per 10 seconds per user ID.
```javascript
class RateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.counters = new Map();
  }

  isAllowed(identifier) {
    const now = Date.now();
    if (!this.counters.has(identifier)) {
      this.counters.set(identifier, { count: 1, startTime: now });
      return true;
    }
    const data = this.counters.get(identifier);
    if (now - data.startTime >= this.windowMs) {
      this.counters.set(identifier, { count: 1, startTime: now });
      return true;
    }
    if (data.count < this.limit) {
      data.count += 1;
      return true;
    }
    return false;
  }
}

const limiter = new RateLimiter(3, 10000); // 3 requests per 10 seconds
const user = 'user123';

console.log(limiter.isAllowed(user)); // true
console.log(limiter.isAllowed(user)); // true
console.log(limiter.isAllowed(user)); // true
console.log(limiter.isAllowed(user)); // false

// After 10 seconds, requests reset
setTimeout(() => {
  console.log(limiter.isAllowed(user)); // true
}, 11000);
```
Output
true
true
true
false
true
Common Pitfalls
- Using fixed windows causes bursts: A client can use up its quota at the end of one window and again at the start of the next, briefly sending close to twice the limit.
- Not handling distributed systems: Local counters fail when multiple servers handle requests; use centralized stores like Redis.
- Ignoring user identification: Rate limiting by IP alone can block many users behind proxies.
- Not cleaning old counters: Memory grows without bound if counters for inactive clients are never removed.
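The stale-counter pitfall in the list above could be handled with a periodic sweep. The following is a minimal sketch, not a definitive implementation: the class name `SweepingRateLimiter`, the `sweep` method, and the optional `now` parameter are assumptions introduced here for illustration and are not part of the limiter shown earlier.

```javascript
// Hypothetical fixed window limiter with a sweep() method (names are illustrative).
class SweepingRateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.counters = new Map(); // identifier -> { count, startTime }
  }

  // `now` is injectable so the behavior can be checked without waiting.
  isAllowed(identifier, now = Date.now()) {
    const data = this.counters.get(identifier);
    if (!data || now - data.startTime >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.counters.set(identifier, { count: 1, startTime: now });
      return true;
    }
    if (data.count < this.limit) {
      data.count += 1;
      return true;
    }
    return false;
  }

  // Delete every entry whose window has expired; returns the number removed.
  sweep(now = Date.now()) {
    let removed = 0;
    for (const [identifier, data] of this.counters) {
      if (now - data.startTime >= this.windowMs) {
        this.counters.delete(identifier); // deleting during Map iteration is safe
        removed += 1;
      }
    }
    return removed;
  }
}
```

One way to use it is to call `sweep()` on a timer, for example `setInterval(() => limiter.sweep(), 10000)`, so that clients who stop sending requests do not keep an entry in the map indefinitely.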
```
/* Wrong: Fixed window causes bursts */
// User sends 3 requests at end of window and 3 at start of next window, total 6 in short time

/* Right: Sliding window or token bucket smooths requests */
// Use timestamps or tokens to allow steady request flow
```
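As an illustration of the timestamp-based approach, here is a minimal sketch of a sliding window log limiter. It assumes an in-memory `Map` on a single process; the class name `SlidingWindowLimiter` and the optional `now` parameter are hypothetical and exist only so the example can be run with explicit timestamps.

```javascript
// Hypothetical sliding window log limiter (single process, in memory).
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.logs = new Map(); // identifier -> ascending timestamps of accepted requests
  }

  isAllowed(identifier, now = Date.now()) {
    const timestamps = this.logs.get(identifier) || [];
    // Drop timestamps that are no longer inside the window ending at `now`.
    while (timestamps.length > 0 && now - timestamps[0] >= this.windowMs) {
      timestamps.shift();
    }
    this.logs.set(identifier, timestamps);
    if (timestamps.length >= this.limit) {
      return false; // already `limit` accepted requests in the last windowMs
    }
    timestamps.push(now);
    return true;
  }
}

// 3 requests per 10 seconds, driven with explicit timestamps in milliseconds.
const slidingLimiter = new SlidingWindowLimiter(3, 10000);
const slidingResults = [9000, 9500, 9900, 10100, 10200, 19000].map(
  (t) => slidingLimiter.isAllowed('user123', t)
);
console.log(slidingResults); // [ true, true, true, false, false, true ]
```

Because the window always ends at the current request, the fourth and fifth requests are rejected even though they arrive just after the 10-second mark. The trade-off is memory: this variant stores one timestamp per accepted request instead of a single counter.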
Quick Reference
Summary tips for API rate limiter design:
- Choose algorithm based on fairness and complexity: fixed window (simple), sliding window (fairer), token bucket (flexible).
- Use a centralized store (Redis, Memcached) for distributed systems.
- Identify clients uniquely (user ID, API key) rather than just IP.
- Set sensible limits and windows based on API usage patterns.
- Implement cleanup for expired counters to save memory.
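The token bucket option from the list above can be sketched as follows. This is an illustrative, in-memory version under stated assumptions: the class name `TokenBucketLimiter`, its `capacity` and `refillPerSecond` parameters, and the optional `now` argument are hypothetical names chosen for this example. Tokens are refilled lazily on each call rather than by a background timer.

```javascript
// Hypothetical token bucket limiter (single process, in memory).
class TokenBucketLimiter {
  constructor(capacity, refillPerSecond) {
    this.capacity = capacity;                  // maximum burst size
    this.refillPerMs = refillPerSecond / 1000; // steady refill rate
    this.buckets = new Map();                  // identifier -> { tokens, lastRefill }
  }

  isAllowed(identifier, now = Date.now()) {
    let bucket = this.buckets.get(identifier);
    if (!bucket) {
      // New clients start with a full bucket.
      bucket = { tokens: this.capacity, lastRefill: now };
      this.buckets.set(identifier, bucket);
    }
    // Add the tokens earned since the last call, capped at capacity.
    const elapsed = Math.max(0, now - bucket.lastRefill);
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsed * this.refillPerMs);
    bucket.lastRefill = now;

    if (bucket.tokens >= 1) {
      bucket.tokens -= 1; // spend one token for this request
      return true;
    }
    return false; // bucket is empty
  }
}
```

With `new TokenBucketLimiter(3, 1)`, a client can send a burst of 3 requests and then roughly 1 request per second afterwards. The capacity controls how large a burst is tolerated, and the refill rate controls the long-run average, which is what makes this algorithm flexible.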
Key Takeaways
Use counters or tokens to track requests per user within a time window.
Select rate limiting algorithms like fixed window, sliding window, or token bucket based on needs.
For distributed APIs, store counters centrally to keep limits consistent.
Identify clients uniquely to avoid blocking multiple users behind shared IPs.
Clean up expired counters to prevent memory leaks and maintain performance.