
How to Design an API Rate Limiter: A Simple and Scalable Approach

To design an API rate limiter, track the requests each client makes (keyed by user ID, API key, or IP address) with counters or tokens over a time window, and reject requests that exceed the limit. Fixed window, sliding window, and token bucket are the common algorithms; which one you choose determines how fair the limiter is and how well it scales.

📐

Syntax

An API rate limiter is typically built from these parts:

identifier: Unique key for user or client (e.g., user ID, IP address).
limit: Maximum allowed requests in a time window.
window: Time duration for counting requests (e.g., 1 minute).
counter: Tracks number of requests made in the current window.
algorithm: Logic to decide if a request is allowed or rejected.

javascript

class RateLimiter {
    constructor(limit, windowMs) {
        this.limit = limit; // max requests
        this.windowMs = windowMs; // time window in ms
        this.counters = new Map(); // stores request counts
    }

    isAllowed(identifier) {
        const now = Date.now();
        if (!this.counters.has(identifier)) {
            this.counters.set(identifier, { count: 1, startTime: now });
            return true;
        }

        const data = this.counters.get(identifier);
        if (now - data.startTime >= this.windowMs) {
            // Reset window
            this.counters.set(identifier, { count: 1, startTime: now });
            return true;
        }

        if (data.count < this.limit) {
            data.count += 1;
            return true;
        }

        return false; // limit exceeded
    }
}

💻

Example

This example shows a simple fixed window rate limiter that allows 3 requests per 10 seconds per user ID.

javascript

class RateLimiter {
    constructor(limit, windowMs) {
        this.limit = limit;
        this.windowMs = windowMs;
        this.counters = new Map();
    }

    isAllowed(identifier) {
        const now = Date.now();
        if (!this.counters.has(identifier)) {
            this.counters.set(identifier, { count: 1, startTime: now });
            return true;
        }

        const data = this.counters.get(identifier);
        if (now - data.startTime >= this.windowMs) {
            this.counters.set(identifier, { count: 1, startTime: now });
            return true;
        }

        if (data.count < this.limit) {
            data.count += 1;
            return true;
        }

        return false;
    }
}

const limiter = new RateLimiter(3, 10000); // 3 requests per 10 seconds
const user = 'user123';

console.log(limiter.isAllowed(user)); // true
console.log(limiter.isAllowed(user)); // true
console.log(limiter.isAllowed(user)); // true
console.log(limiter.isAllowed(user)); // false

// Once the 10-second window has elapsed, the counter resets
setTimeout(() => {
    console.log(limiter.isAllowed(user)); // true
}, 11000);

Output

true
true
true
false
true

⚠️

Common Pitfalls

Using fixed windows causes bursts: A client can use its full limit at the end of one window and again at the start of the next, sending up to twice the limit in a short span.
Not handling distributed systems: Local in-memory counters fail when multiple servers handle requests, because each server sees only part of the traffic; use a centralized store like Redis.
Ignoring user identification: Rate limiting by IP alone can block many users who share one address behind a proxy or NAT.
Not cleaning old counters: Memory grows without bound if counters are never removed after inactivity.


/* Wrong: Fixed window causes bursts */
// User sends 3 requests at end of window and 3 at start of next window, total 6 in short time

/* Right: Sliding window or token bucket smooths requests */
// Use timestamps or tokens to allow steady request flow
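
The timestamp-based idea can be sketched as a sliding-window log. This is a minimal illustration, not a production implementation: the class name SlidingWindowLimiter and the optional now clock parameter are assumptions introduced here so the behavior can be shown with a fake clock.

```javascript
// Sliding-window log: remember the timestamps of recent requests per client
// and count only those inside the window that ends at the current moment.
// `now` is a hypothetical hook for injecting a clock; it defaults to Date.now.
class SlidingWindowLimiter {
    constructor(limit, windowMs, now = Date.now) {
        this.limit = limit;       // max requests in any span of windowMs
        this.windowMs = windowMs; // window length in ms
        this.now = now;           // function returning the current time in ms
        this.logs = new Map();    // identifier -> array of request timestamps
    }

    isAllowed(identifier) {
        const current = this.now();
        const cutoff = current - this.windowMs;
        // Drop timestamps that have slid out of the window
        const recent = (this.logs.get(identifier) || []).filter((t) => t > cutoff);

        const allowed = recent.length < this.limit;
        if (allowed) {
            recent.push(current); // only accepted requests are logged
        }

        // Store the pruned log, or forget the client if nothing is left
        if (recent.length > 0) {
            this.logs.set(identifier, recent);
        } else {
            this.logs.delete(identifier);
        }
        return allowed;
    }
}

// Usage with a fake clock: 3 requests per 10 seconds
let fakeTime = 9000;
const sliding = new SlidingWindowLimiter(3, 10000, () => fakeTime);
console.log(sliding.isAllowed('user123')); // true
console.log(sliding.isAllowed('user123')); // true
console.log(sliding.isAllowed('user123')); // true

fakeTime = 10500; // a fixed window starting at 0 would have reset by now
console.log(sliding.isAllowed('user123')); // false

fakeTime = 19001; // the requests made at 9000 are now older than 10 seconds
console.log(sliding.isAllowed('user123')); // true
```

The trade-off is memory: the log holds up to limit timestamps per client, where the fixed-window version holds a single counter.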

📊

Quick Reference

Summary tips for API rate limiter design:

Choose an algorithm based on fairness and complexity: fixed window (simple), sliding window (fairer), token bucket (flexible).
Use a centralized store (Redis, Memcached) for distributed systems.
Identify clients uniquely (user ID, API key) rather than just IP.
Set sensible limits and windows based on API usage patterns.
Implement cleanup for expired counters to save memory.
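
As one way to picture the token bucket option together with the cleanup tip, here is a rough sketch. The names TokenBucketLimiter, capacity, refillPerSec, and sweep, and the injectable now clock, are all assumptions made for illustration.

```javascript
// Token bucket: each client owns a bucket holding up to `capacity` tokens.
// Tokens refill at a steady rate and every accepted request spends one.
// `now` is a hypothetical clock hook; it defaults to Date.now.
class TokenBucketLimiter {
    constructor(capacity, refillPerSec, now = Date.now) {
        this.capacity = capacity;         // largest burst a client can send
        this.refillPerSec = refillPerSec; // sustained requests per second
        this.now = now;
        this.buckets = new Map();         // identifier -> { tokens, updatedAt }
    }

    isAllowed(identifier) {
        const current = this.now();
        // New clients start with a full bucket
        const bucket = this.buckets.get(identifier) ||
            { tokens: this.capacity, updatedAt: current };

        // Credit the tokens earned since the last update, capped at capacity
        const elapsedMs = current - bucket.updatedAt;
        const earned = (elapsedMs * this.refillPerSec) / 1000;
        bucket.tokens = Math.min(this.capacity, bucket.tokens + earned);
        bucket.updatedAt = current;
        this.buckets.set(identifier, bucket);

        if (bucket.tokens >= 1) {
            bucket.tokens -= 1;
            return true;
        }
        return false; // bucket is empty
    }

    // Cleanup: a bucket idle long enough to refill completely is
    // indistinguishable from a brand-new one, so it can be dropped.
    sweep() {
        const current = this.now();
        const fullRefillMs = (this.capacity / this.refillPerSec) * 1000;
        for (const [identifier, bucket] of this.buckets) {
            if (current - bucket.updatedAt >= fullRefillMs) {
                this.buckets.delete(identifier);
            }
        }
    }
}
```

With capacity 3 and refillPerSec 1, a client can send 3 requests at once and then roughly one per second. Calling sweep() periodically (for example from a timer) keeps the map from growing as clients come and go.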

✅

Key Takeaways

Use counters or tokens to track requests per user within a time window.

Select rate limiting algorithms like fixed window, sliding window, or token bucket based on needs.

For distributed APIs, store counters centrally to keep limits consistent.

Identify clients uniquely to avoid blocking multiple users behind shared IPs.

Clean up expired counters to prevent memory leaks and maintain performance.
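
The central-storage and cleanup takeaways can be sketched together as follows. The store object is a hypothetical async interface with incr(key) and pexpire(key, ms) methods, modeled on the Redis INCR and PEXPIRE commands; it is not the API of any particular client library, and FakeStore is only an in-memory stand-in for trying the logic locally. Unlike the class above, this sketch aligns windows to the clock rather than to each client's first request.

```javascript
// Fixed-window check against a shared counter store.
// ASSUMPTION: `store` exposes async incr(key) and pexpire(key, ms), named
// after the Redis INCR and PEXPIRE commands. Adapt to your real client.
async function isAllowedShared(store, identifier, limit, windowMs, now = Date.now()) {
    // Every server derives the same key for the same client and window
    const windowId = Math.floor(now / windowMs);
    const key = `ratelimit:${identifier}:${windowId}`;

    const count = await store.incr(key);
    if (count === 1) {
        // First request of this window: ask the store to expire the key,
        // so old counters are cleaned up without extra code
        await store.pexpire(key, windowMs);
    }
    return count <= limit;
}

// In-memory stand-in for experimentation only (not shared between servers)
class FakeStore {
    constructor() {
        this.data = new Map(); // key -> { value, expiresAt }
    }

    async incr(key) {
        const entry = this.data.get(key);
        const live = entry && entry.expiresAt > Date.now()
            ? entry
            : { value: 0, expiresAt: Infinity };
        live.value += 1;
        this.data.set(key, live);
        return live.value;
    }

    async pexpire(key, ms) {
        const entry = this.data.get(key);
        if (entry) {
            entry.expiresAt = Date.now() + ms;
        }
    }
}
```

One caveat with this shape: the increment and the expiry are two separate calls, so a failure between them could leave a counter without an expiry. Real deployments often combine the two steps into a single atomic operation on the store.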