
Rate limiting with sliding window in Redis - Deep Dive

Overview - Rate limiting with sliding window
What is it?
Rate limiting with sliding window is a technique to control how many times a user or system can perform an action within a moving time frame. It helps prevent overuse or abuse by counting requests in a recent period that slides forward with time. Unlike fixed windows, it offers smoother limits by considering the exact timing of each request. This method is often used in systems like APIs to keep traffic manageable.
Why it matters
Without rate limiting, systems can get overwhelmed by too many requests at once, causing slowdowns or crashes. Sliding window rate limiting ensures fair use by tracking requests more precisely over time, avoiding sudden bursts that fixed windows might miss. This keeps services reliable and responsive, protecting both users and providers from overload.
Where it fits
Before learning this, you should understand basic rate limiting concepts and how Redis stores data. After this, you can explore advanced rate limiting algorithms, distributed rate limiting, and how to integrate rate limiting into real-world applications.
Mental Model
Core Idea
Sliding window rate limiting counts requests in a continuously moving time frame to allow smooth and fair control over usage.
Think of it like...
Imagine a rolling hourglass that measures how much sand (requests) has passed recently, not just in fixed chunks of time. This way, you always know the exact recent usage, not just in blocks.
┌───────────────────────────────┐
│        Sliding Window         │
│ ┌───────────────┐             │
│ │   Time Line   │             │
│ │───────────────│             │
│ │  | | | | | |  │ ← Requests  │
│ │  ^^^^^^^^^^^  │             │
│ │ sliding window│             │
│ └───────────────┘             │
│ Counts requests in last N sec │
└───────────────────────────────┘
Build-Up - 7 Steps
1
Foundation: What is rate limiting
🤔
Concept: Introduce the basic idea of rate limiting to control how often actions happen.
Rate limiting means setting a maximum number of times a user or system can do something in a certain time. For example, allowing 10 requests per minute means if you send more than 10 requests in one minute, the system will block or delay extra requests.
Result
You understand the need to control usage to keep systems stable.
Understanding the basic purpose of rate limiting helps you see why controlling request rates is essential for system health.
2
Foundation: Fixed window rate limiting basics
🤔
Concept: Learn how fixed windows count requests in fixed time blocks.
Fixed window rate limiting divides time into chunks, like minutes or hours. It counts requests in each chunk separately. For example, if the limit is 10 per minute, the count resets every new minute. This is simple but can cause bursts at window edges.
Result
You see how fixed windows work but also their limitations with bursts.
Knowing fixed windows sets the stage to appreciate why sliding windows improve fairness.
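The edge-burst problem described above is easy to see in code. This is a minimal in-memory illustration (the FixedWindowLimiter class is our own name, not a Redis or library API):

```python
import time

class FixedWindowLimiter:
    """Illustrative in-memory fixed-window counter (not a Redis client)."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # window id -> request count in that fixed block

    def allow(self, now=None):
        now = time.time() if now is None else now
        window_id = int(now // self.window)  # which fixed block this request falls in
        self.counts[window_id] = self.counts.get(window_id, 0) + 1
        return self.counts[window_id] <= self.limit

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
# 3 requests at t=59 are allowed, then 3 more at t=61 are also allowed:
print([limiter.allow(now=59) for _ in range(3)])  # -> [True, True, True]
print([limiter.allow(now=61) for _ in range(3)])  # -> [True, True, True]
```

Six requests got through within two seconds even though the stated limit is 3 per minute, because the counter reset at the window edge. This is exactly the burst that sliding windows prevent.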
3
Intermediate: Sliding window concept explained
🤔 Before reading on: do you think sliding window counts requests only in fixed intervals or continuously? Commit to your answer.
Concept: Sliding window counts requests continuously over a moving time frame, not fixed blocks.
Instead of counting requests in fixed chunks, sliding window looks back over a recent time period that moves forward with each request. For example, it counts how many requests happened in the last 60 seconds from now, no matter when the last window started.
Result
You understand sliding windows provide smoother, more accurate rate limiting.
Understanding continuous counting prevents sudden bursts and unfair blocking seen in fixed windows.
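The continuous count above can be sketched in one small function. This is an in-memory illustration under our own naming (count_in_window is a hypothetical helper, not a Redis command):

```python
def count_in_window(timestamps, now, window=60.0):
    """Count requests whose timestamp falls within the last `window` seconds."""
    return sum(1 for t in timestamps if now - window < t <= now)

requests = [0, 10, 30, 59, 61]             # seconds at which requests arrived
print(count_in_window(requests, now=61))   # window (1, 61]: 10, 30, 59, 61 -> 4
print(count_in_window(requests, now=120))  # window (60, 120]: only 61 -> 1
```

Notice that the window boundary moves with `now`: there is no reset moment, so the count can never spike the way it does at a fixed window's edge.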
4
Intermediate: Implementing sliding window in Redis
🤔 Before reading on: do you think Redis stores counts as simple numbers or timestamps for sliding windows? Commit to your answer.
Concept: Use Redis sorted sets to store timestamps of each request for sliding window counting.
In Redis, each request's timestamp is added to a sorted set with the timestamp as the score. To check the count, Redis removes timestamps older than the window and counts the remaining ones. This way, the system knows exactly how many requests happened recently.
Result
You see how Redis data structures enable efficient sliding window rate limiting.
Knowing Redis sorted sets store timestamps unlocks precise and efficient sliding window implementations.
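To make the sorted-set mechanics concrete, here is an in-memory stand-in (the SortedSetWindow class is our own sketch, not a Redis client) that mirrors the ZADD, ZREMRANGEBYSCORE, and ZCARD sequence described above:

```python
import bisect

class SortedSetWindow:
    """Mimics the ZADD / ZREMRANGEBYSCORE / ZCARD sequence in memory."""
    def __init__(self):
        self.scores = []  # sorted request timestamps (the sorted-set scores)

    def zadd(self, timestamp):
        bisect.insort(self.scores, timestamp)  # keep timestamps ordered, like a sorted set

    def zremrangebyscore_upto(self, cutoff):
        # drop everything with score <= cutoff (i.e. older than the window)
        self.scores = self.scores[bisect.bisect_right(self.scores, cutoff):]

    def zcard(self):
        return len(self.scores)

w = SortedSetWindow()
for t in (100, 130, 170):
    w.zadd(t)
w.zremrangebyscore_upto(170 - 60)   # keep only the last 60 seconds
print(w.zcard())                    # 130 and 170 remain -> 2
```

In real Redis the same three operations run against one key per user, with the request's arrival time as the score; the sorted order is what makes the "remove everything older than the window" step cheap.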
5
Intermediate: Sliding window algorithm steps
🤔
Concept: Learn the step-by-step process to enforce sliding window limits.
1. Add the current request timestamp to a Redis sorted set.
2. Remove timestamps older than the window size.
3. Count the remaining timestamps.
4. If the count exceeds the limit, block the request; otherwise allow it.
This process repeats for each request to keep limits accurate.
Result
You can describe how sliding window rate limiting works in practice.
Understanding the algorithm steps clarifies how sliding windows maintain fairness over time.
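The four steps above combine into a single decision function. This is an in-memory sketch under our own naming (SlidingWindowLimiter is not a library class), with the steps marked in comments:

```python
import bisect

class SlidingWindowLimiter:
    """In-memory version of the four steps: add, trim, count, decide."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = []

    def allow(self, now):
        bisect.insort(self.timestamps, now)                           # 1. add
        cutoff = now - self.window
        self.timestamps = [t for t in self.timestamps if t > cutoff]  # 2. trim old
        return len(self.timestamps) <= self.limit                     # 3-4. count, decide

limiter = SlidingWindowLimiter(limit=3, window_seconds=60)
print([limiter.allow(now) for now in (0, 10, 20, 30, 70)])
# -> [True, True, True, False, True]
```

The request at t=30 is the fourth inside one window and gets blocked; by t=70 the requests at t=0 and t=10 have slid out of the window, so capacity frees up gradually instead of all at once.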
6
Advanced: Handling concurrency and race conditions
🤔 Before reading on: do you think simple Redis commands are enough to avoid race conditions in sliding window? Commit to your answer.
Concept: Use Redis transactions or Lua scripts to make sliding window updates atomic and safe under concurrency.
Multiple requests can arrive at the same time, causing race conditions if commands run separately. Using Redis transactions (MULTI/EXEC) or Lua scripts bundles commands so they run together without interference. This ensures counts are accurate and limits enforced correctly.
Result
You understand how to keep sliding window rate limiting reliable under heavy load.
Knowing how to handle concurrency prevents subtle bugs that can let limits be bypassed.
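As a rough analogy, a thread lock can play the role that a Lua script plays in Redis: the add, trim, and count steps execute as one indivisible unit. This is an illustrative in-memory sketch (AtomicSlidingWindow is our own name, and the lock stands in for Redis's single-threaded script execution), not Redis code:

```python
import threading
import time
import bisect

class AtomicSlidingWindow:
    """The lock stands in for Redis executing a Lua script atomically."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = []
        self.lock = threading.Lock()

    def allow(self):
        with self.lock:  # add + trim + count happen as one unit, no interleaving
            now = time.monotonic()
            bisect.insort(self.timestamps, now)
            cutoff = now - self.window
            self.timestamps = [t for t in self.timestamps if t > cutoff]
            return len(self.timestamps) <= self.limit

limiter = AtomicSlidingWindow(limit=50, window_seconds=60)
results = []
threads = [threading.Thread(target=lambda: results.append(limiter.allow()))
           for _ in range(100)]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(sum(results))   # exactly 50 of the 100 concurrent requests are allowed
```

Without the lock, two threads could both read a count of 49 and both decide to allow, exceeding the limit; the same interleaving happens in Redis when the three commands are issued separately instead of inside a script or MULTI/EXEC.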
7
Expert: Optimizing sliding window for performance
🤔 Before reading on: do you think storing every request timestamp forever is efficient? Commit to your answer.
Concept: Use techniques like approximate counting, TTLs, and batch removals to optimize Redis sliding window performance.
Storing every timestamp can grow large and slow down queries. To optimize, set expiration (TTL) on keys to auto-remove old data, batch remove old timestamps less frequently, or use approximate algorithms like leaky bucket or token bucket combined with sliding window. These reduce memory and CPU use while keeping accuracy.
Result
You learn how to scale sliding window rate limiting for high-traffic systems.
Understanding optimization techniques helps build efficient, scalable rate limiting in production.
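One of the optimizations above, batch removal, can be sketched in memory: instead of trimming on every request, trim every N inserts and count lazily in between. This is an illustrative sketch under our own naming (LazyTrimWindow is hypothetical), not a production pattern from any specific library:

```python
import bisect

class LazyTrimWindow:
    """Trim old timestamps only every `trim_every` inserts (batch removal)."""
    def __init__(self, window_seconds, trim_every=100):
        self.window = window_seconds
        self.trim_every = trim_every
        self.timestamps = []
        self.inserts = 0

    def add(self, now):
        bisect.insort(self.timestamps, now)
        self.inserts += 1
        if self.inserts % self.trim_every == 0:   # amortize the cleanup cost
            cutoff = now - self.window
            self.timestamps = [t for t in self.timestamps if t > cutoff]

    def count(self, now):
        cutoff = now - self.window
        # count without mutating; stale entries may linger until the next trim
        return len(self.timestamps) - bisect.bisect_right(self.timestamps, cutoff)

w = LazyTrimWindow(window_seconds=50, trim_every=100)
for i in range(200):
    w.add(float(i))
print(w.count(199.0))       # 50 requests fall within the last 50 seconds
print(len(w.timestamps))    # storage was also trimmed back to 50 entries
```

The count stays exact because stale entries are excluded at read time; only the memory cleanup is deferred. In Redis the analogous move is to run ZREMRANGEBYSCORE periodically rather than per request, and to put a TTL on each user's key so idle keys expire on their own.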
Under the Hood
Sliding window rate limiting uses a data structure (like Redis sorted sets) to store timestamps of each request. When a new request arrives, the system removes timestamps outside the current window and counts the rest. This count determines if the request is allowed. Internally, Redis uses efficient sorted set operations to add, remove, and count timestamps quickly. Atomic operations via Lua scripts or transactions ensure consistency under concurrent requests.
Why designed this way?
Sliding window was designed to fix the burstiness problem of fixed windows, where users could send many requests at window edges. By tracking exact timestamps, it smooths out request counts over time. Redis sorted sets were chosen because they efficiently store and query ordered data with scores, perfect for timestamp management. Atomic scripts prevent race conditions, ensuring accurate limits even with many simultaneous requests.
┌─────────────┐     ┌───────────────┐     ┌────────────────┐
│ New Request │────▶│ Add timestamp │────▶│ Remove old     │
└─────────────┘     └───────────────┘     │ timestamps     │
                                          └────────────────┘
                                                  │
                                                  ▼
                                          ┌────────────────┐
                                          │ Count recent   │
                                          │ timestamps     │
                                          └────────────────┘
                                                  │
                                                  ▼
                                          ┌────────────────┐
                                          │ Allow or       │
                                          │ block request  │
                                          └────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does sliding window rate limiting allow bursts at window edges like fixed windows? Commit yes or no.
Common Belief: Sliding window rate limiting still allows bursts at window edges just like fixed windows.
Reality: Sliding window smooths request counts continuously, preventing bursts at window edges by counting exact timestamps.
Why it matters: Believing this causes people to wrongly prefer fixed windows, missing out on smoother, fairer rate limiting.
Quick: Do you think storing every request timestamp in Redis is always efficient? Commit yes or no.
Common Belief: Storing every request timestamp in Redis is efficient and scalable without limits.
Reality: Storing all timestamps can grow large and slow down Redis; optimizations like TTLs and batch removals are needed.
Why it matters: Ignoring this leads to performance degradation and possible system failures under heavy load.
Quick: Is it safe to run multiple Redis commands separately for sliding window updates under concurrency? Commit yes or no.
Common Belief: Running Redis commands separately for sliding window updates is safe even with many concurrent requests.
Reality: Separate commands can cause race conditions; atomic transactions or Lua scripts are needed for correctness.
Why it matters: Not using atomic operations can let users bypass limits or cause inconsistent counts.
Quick: Does sliding window rate limiting require complex external databases beyond Redis? Commit yes or no.
Common Belief: Sliding window rate limiting always needs complex external databases or services beyond Redis.
Reality: Redis alone with sorted sets and scripts can efficiently implement sliding window rate limiting.
Why it matters: Thinking otherwise may lead to unnecessary complexity and cost in system design.
Expert Zone
1
Sliding window accuracy depends on precise timestamp recording; clock skew or delays can affect limits subtly.
2
Choosing the window size balances user experience and system protection; too small causes strict limits, too large delays detection.
3
Lua scripting in Redis not only ensures atomicity but can also bundle multiple logic steps, reducing network overhead.
When NOT to use
Sliding window rate limiting is less suitable when extremely high throughput with minimal latency is required; token bucket or leaky bucket algorithms may be better. Also, for very simple use cases, fixed window limits might suffice. When distributed rate limiting across multiple servers is needed, additional coordination or external systems may be necessary.
Production Patterns
In production, sliding window rate limiting is often combined with user identification keys and Redis clusters for scalability. Systems use Lua scripts to atomically update counts and enforce limits. Monitoring and alerting track rate limit hits to adjust thresholds dynamically. Hybrid approaches mix sliding window with token bucket for burst tolerance.
Connections
Token Bucket Algorithm
Related rate limiting algorithm with different burst handling
Understanding token bucket helps compare how sliding window smooths counts continuously while token bucket allows bursts up to a token capacity.
Cache Expiration
Builds on Redis key TTL concepts
Knowing how Redis key expiration works helps optimize sliding window by automatically removing old request data, improving performance.
Traffic Shaping in Networking
Similar concept of controlling flow rates over time
Recognizing that rate limiting mirrors traffic shaping in networks reveals how controlling flow prevents overload in both data and requests.
Common Pitfalls
#1 Not removing old timestamps, causing unlimited growth
Wrong approach:
ZADD user:123 1680000000 request1
ZADD user:123 1680000050 request2
ZCARD user:123
(no removal of old timestamps before counting)
Correct approach:
ZADD user:123 1680000000 request1
ZADD user:123 1680000050 request2
ZREMRANGEBYSCORE user:123 -inf 1679999990
ZCARD user:123
(for a 60-second window ending at 1680000050, everything at or before 1679999990 is dropped)
Root cause: Forgetting to remove timestamps outside the sliding window leads to inaccurate counts and memory bloat.
#2 Running commands separately causing race conditions
Wrong approach:
ZADD user:123 now request
ZREMRANGEBYSCORE user:123 -inf window_start
ZCARD user:123
(commands run separately, so concurrent requests can interleave between them)
Correct approach:
EVAL "redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2]); redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', ARGV[3]); return redis.call('ZCARD', KEYS[1])" 1 user:123 now request window_start
Root cause: Not using atomic Lua scripts or transactions allows concurrent requests to interfere, causing incorrect counts.
#3 Setting window size too small causing strict limits
Wrong approach: Limit of 5 requests per 1-second window for all users
Correct approach: Limit of 100 requests per 60-second window for all users
Root cause: Choosing an impractical window size can frustrate users or fail to protect the system effectively.
Key Takeaways
Sliding window rate limiting counts requests over a continuously moving time frame for smooth control.
Redis sorted sets efficiently store and manage request timestamps for sliding window implementations.
Atomic operations via Lua scripts or transactions are essential to avoid race conditions under concurrency.
Optimizations like TTLs and batch removals keep sliding window rate limiting scalable and performant.
Understanding sliding window helps build fair, reliable systems that prevent overload and abuse.