
Distributed caching (Redis, Memcached) in HLD - System Design Guide

Problem Statement
When a single cache server is used, it becomes a bottleneck under high load, causing slow response times and potential downtime if the cache server fails. Also, a single cache cannot handle the large data volume needed for scalable applications, leading to cache misses and increased load on the database.
Solution
Distributed caching spreads cached data across multiple cache servers, allowing the system to handle more requests and larger data volumes. It uses consistent hashing or partitioning to route requests to the correct cache node, ensuring fast data retrieval and fault tolerance by replicating data or failing over to other nodes.
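The routing step above can be sketched with a consistent-hash ring. The following is a minimal Python sketch, not a specific library's API; the `ConsistentHashRing` name, the MD5 hash choice, and the virtual-node count are illustrative assumptions.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map keys to cache nodes so that adding or removing a node only
    remaps the keys that lived on that node (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server smooth the distribution
        self._hashes = []      # sorted ring positions
        self._nodes = []       # node that owns each ring position
        for node in nodes:
            self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            idx = bisect.bisect(self._hashes, h)
            self._hashes.insert(idx, h)
            self._nodes.insert(idx, node)

    def remove_node(self, node):
        keep = [(h, n) for h, n in zip(self._hashes, self._nodes) if n != node]
        self._hashes = [h for h, _ in keep]
        self._nodes = [n for _, n in keep]

    def get_node(self, key):
        # First ring position clockwise of the key's hash, wrapping at the end.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[idx]
```

The virtual nodes are what prevent hot spots: each physical server owns many small arcs of the ring instead of one large one, so load spreads more evenly and a failed node's keys scatter across the survivors.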
Architecture
[Architecture diagram: Client Apps → Cache Clients (consistent hashing) → Redis cache nodes (Cache 1 to Cache N) → Database]

This diagram shows a client application using a cache client with consistent hashing to distribute requests across multiple Redis cache nodes, which in turn reduce load on the database.
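The read path this diagram describes is commonly implemented as cache-aside: try the cache node first, and on a miss load from the database and populate the cache. A minimal sketch, with plain dicts standing in for Redis nodes; the class name, modulo-hash routing, and TTL parameter are illustrative assumptions.

```python
import hashlib
import time

class CacheAsideClient:
    """Cache-aside read path: hash the key to pick a node, try the
    cache first, and load from the database on a miss (sketch)."""

    def __init__(self, num_nodes, db, ttl_seconds=60.0):
        # Dicts of key -> (value, expires_at) stand in for Redis nodes.
        self.nodes = [dict() for _ in range(num_nodes)]
        self.db = db               # any mapping-like backing store
        self.ttl = ttl_seconds

    def _node_for(self, key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def get(self, key):
        node = self._node_for(key)
        entry = node.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]        # cache hit: database untouched
        value = self.db[key]       # cache miss: read through to the database
        node[key] = (value, time.monotonic() + self.ttl)
        return value
```

Because every client hashes keys the same way, repeated reads of a key land on the same node, which is what keeps the hit rate high as nodes are added.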

Trade-offs
✓ Pros
Improves cache capacity and throughput by distributing data across multiple nodes.
Increases fault tolerance; if one cache node fails, others can serve requests.
Reduces latency by routing requests to the correct cache node directly.
Enables horizontal scaling of cache layer as demand grows.
✗ Cons
Adds complexity in managing cache consistency and invalidation across nodes.
Requires careful partitioning and hashing to avoid uneven data distribution (hot spots).
Replication and failover mechanisms increase operational overhead.
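One common way to handle the invalidation problem above is write-invalidate: write to the database first (the source of truth), then delete the cached copy on its owning node so the next read repopulates it. A minimal sketch under that assumption; the class name and dict-backed nodes are illustrative, not a real client library.

```python
import hashlib

class WriteInvalidatingClient:
    """Delete-on-write invalidation: stale entries are removed rather
    than updated in place, avoiding divergent copies (sketch)."""

    def __init__(self, num_nodes, db):
        self.nodes = [dict() for _ in range(num_nodes)]  # stand-ins for cache servers
        self.db = db

    def _node_for(self, key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def get(self, key):
        node = self._node_for(key)
        if key in node:
            return node[key]            # cache hit
        value = self.db[key]            # miss: read through and repopulate
        node[key] = value
        return value

    def put(self, key, value):
        self.db[key] = value            # write the source of truth first
        self._node_for(key).pop(key, None)  # then drop the stale cached copy
```

Deleting instead of updating sidesteps write races between concurrent writers, at the cost of one extra database read on the next access.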
When to Use
Use distributed caching when your application has high read traffic exceeding thousands of requests per second and requires low latency with cache data sets too large to fit on a single server.
Avoid distributed caching if your cache data fits on a single server or your read traffic is under 1,000 requests per second, where the added complexity outweighs the benefits.
Real World Examples
Netflix
Uses distributed Redis caching to store user session data and metadata, enabling fast access and fault tolerance across global edge locations.
Twitter
Employs Memcached clusters to cache timelines and user data, reducing database load and improving response times during traffic spikes.
Uber
Uses distributed caching to store frequently accessed geolocation and pricing data, ensuring low latency for ride matching.
Alternatives
Local in-memory caching
Caches data only on the application server's memory without sharing across nodes.
Use when: the application is small-scale or stateful per instance, and data consistency across nodes is not critical.
CDN caching
Caches static content closer to users at edge locations rather than application data in memory.
Use when: the main goal is speeding up delivery of static assets such as images, videos, or web pages.
Database caching with read replicas
Uses database replicas to distribute read load instead of a separate cache layer.
Use when: data freshness is critical and you want to minimize caching complexity.
Summary
Distributed caching spreads cached data across multiple servers to handle high traffic and large data volumes.
It improves fault tolerance and reduces latency by routing requests to the correct cache node.
However, it adds complexity in managing data consistency and requires careful partitioning.