
Cache stampede prevention in HLD - System Design Guide

Problem Statement
When many users request the same data that is missing or expired in the cache, all requests hit the backend database or service simultaneously. This overloads the backend, causing slow responses or crashes, and defeats the purpose of caching.
Solution
Cache stampede prevention limits the number of requests that fetch data from the backend when the cache is empty or expired. It does this by allowing only one request to rebuild the cache while others wait or get stale data, thus reducing backend load and improving response times.
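The single-rebuilder idea can be sketched in Python. This is a minimal, in-process illustration (the `SingleFlightCache` name and its methods are illustrative, not a real library API); a distributed deployment would coordinate through a shared store such as Redis rather than a `threading.Lock`:

```python
import threading
import time

class SingleFlightCache:
    """Only one caller rebuilds an expired entry; concurrent callers
    receive the last known (possibly stale) value instead of hitting
    the backend themselves."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._data = {}        # key -> (value, expires_at)
        self._locks = {}       # key -> per-key rebuild lock
        self._meta_lock = threading.Lock()

    def _lock_for(self, key):
        with self._meta_lock:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key, loader):
        entry = self._data.get(key)
        now = time.monotonic()
        if entry and entry[1] > now:
            return entry[0]                  # fresh hit
        lock = self._lock_for(key)
        if lock.acquire(blocking=False):
            try:                             # this caller won the rebuild
                value = loader(key)          # only one backend call per key
                self._data[key] = (value, time.monotonic() + self.ttl)
                return value
            finally:
                lock.release()
        if entry:
            return entry[0]                  # serve stale while another caller rebuilds
        with lock:                           # cold miss: wait for the rebuild to finish
            entry = self._data.get(key)
            if entry:
                return entry[0]
            value = loader(key)              # rebuild failed upstream; retry here
            self._data[key] = (value, time.monotonic() + self.ttl)
            return value
```

With ten concurrent misses for the same key, the loader runs once: one thread wins the non-blocking acquire, the rest either return the stale value or block briefly and read the freshly cached one.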
Architecture
User Request → Cache Layer → (miss: acquire lock/token) → Backend
Other Users  → Cache Layer → (miss: wait or use stale data)

This diagram shows user requests checking the cache. If data is missing, one request acquires a lock or token to fetch from backend, while others wait or use stale cache data.

Trade-offs
✓ Pros
Prevents backend overload by limiting simultaneous cache misses.
Improves overall system stability during high traffic spikes.
Reduces latency for most users by serving cached or stale data.
Avoids repeated expensive computations or database queries.
✗ Cons
Adds complexity with locking or token management in cache layer.
Potentially serves stale data to some users during cache rebuild.
Requires careful timeout and fallback strategies to avoid deadlocks.
Use when cache misses cause backend overload or slowdowns, especially when the same hot keys receive thousands of requests per second.
Avoid if cache miss rate is very low or backend can handle sudden load spikes without degradation.
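The deadlock risk noted above is usually handled by giving the rebuild lock its own TTL, so a crashed or stalled rebuilder cannot block other requests forever. A minimal single-process sketch of that idea (the `ExpiringLock` name is illustrative; a distributed setup typically gets the same effect from an atomic store operation such as Redis `SET key value NX EX ttl`):

```python
import time

class ExpiringLock:
    """Rebuild token with a TTL: if the holder crashes mid-rebuild,
    the token lapses and another caller may take over.
    Sketch only -- not thread-safe as written."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expires_at = 0.0   # 0 means the token is free

    def try_acquire(self):
        now = time.monotonic()
        if now >= self._expires_at:      # free, or previous holder timed out
            self._expires_at = now + self.ttl
            return True
        return False
```

The TTL should comfortably exceed the worst-case rebuild time; if it is too short, two rebuilders can run at once, which trades a bounded duplicate fetch for deadlock safety.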
Real World Examples
Netflix
Netflix uses cache stampede prevention to avoid overwhelming their recommendation service when popular content metadata expires simultaneously.
Twitter
Twitter applies cache stampede prevention to limit database hits when trending topics cause many users to request the same data.
Shopify
Shopify uses this pattern to protect their product catalog backend during flash sales when cache entries expire and many users request the same product info.
Alternatives
Cache Aside
Clients check cache and fetch from backend on miss without coordination, risking stampedes.
Use when: Backend can handle occasional spikes and simplicity is preferred.
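For contrast, the uncoordinated cache-aside read path is only a few lines. A sketch using a plain dict as the cache (function and parameter names are illustrative):

```python
import time

def cache_aside_get(cache, key, loader, ttl_seconds):
    """Plain cache-aside read: every concurrent miss for the same key
    hits the backend independently -- this is where stampedes come from."""
    entry = cache.get(key)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]
    value = loader(key)   # no coordination: N concurrent misses = N backend calls
    cache[key] = (value, time.monotonic() + ttl_seconds)
    return value
```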
Write-Through Cache
Cache is updated synchronously on writes, reducing cache misses but increasing write latency.
Use when: Data freshness is critical and the write load is manageable.
Probabilistic Early Expiration
Cache entries are treated as expired at a randomized point before their TTL, spreading backend rebuild load over time instead of concentrating it at expiry.
Use when: Backend can tolerate some load spikes and the complexity of locking is undesired.
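A well-known form of this technique is probabilistic early recomputation: each reader refreshes ahead of the TTL with a probability that grows as expiry approaches, scaled by how long the recompute takes. A minimal sketch (the function name and parameters are illustrative):

```python
import math
import random
import time

def should_refresh_early(expires_at, recompute_seconds, beta=1.0, now=None):
    """Return True if this caller should rebuild the entry ahead of its TTL.
    The -log(u) term (u uniform in (0, 1]) makes an early refresh more likely
    the closer we are to expiry and the more expensive the recompute is;
    beta > 1 shifts refreshes earlier."""
    if now is None:
        now = time.monotonic()
    # 1.0 - random.random() lies in (0.0, 1.0], so log() never sees zero
    u = 1.0 - random.random()
    return now - recompute_seconds * beta * math.log(u) >= expires_at
```

Because each caller rolls its own random skew, rebuilds for a hot key scatter across the moments before expiry rather than piling up at a single instant.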
Summary
Cache stampede prevention stops many requests from hitting the backend simultaneously when cache entries expire.
It uses locking or tokens to allow only one request to refresh the cache while others wait or get stale data.
This pattern improves system stability and response times during traffic spikes but adds some complexity.