How to Design a Distributed Cache: Key Concepts and Example
To design a distributed cache, partition cached data across multiple nodes to improve speed and scalability, using consistent hashing or another sharding scheme to assign keys to nodes. Keep data fresh with cache invalidation or a TTL (time to live), and handle node failures with replication or by falling back to the main database.
Syntax
A distributed cache system typically involves these parts:
- Cache Nodes: Multiple servers storing parts of the cache.
- Consistent Hashing: A method that spreads keys across cache nodes and remaps only a small share of keys when nodes are added or removed.
- Cache Client: The application component that queries the cache.
- Cache Invalidation: Mechanism to keep cache data fresh.
- Replication: Copying data to multiple nodes for fault tolerance.
javascript
class DistributedCache {
  constructor(nodes) {
    this.nodes = nodes; // list of cache servers
    this.hashRing = this.createHashRing(nodes);
  }

  createHashRing(nodes) {
    // Placeholder: a real implementation maps each node to points on a hash ring.
    // Simplified here to return the node list unchanged.
    return nodes;
  }

  getNode(key) {
    // Placeholder lookup (key length modulo node count), not real consistent hashing.
    // A real implementation would hash the key and walk the ring.
    return this.nodes[key.length % this.nodes.length];
  }

  get(key) {
    const node = this.getNode(key);
    return node.get(key);
  }

  set(key, value) {
    const node = this.getNode(key);
    node.set(key, value);
  }
}
Example
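This example shows a simple distributed cache with two nodes. For brevity it picks a node with a simple modulo hash (the sum of the key's character codes modulo the node count) rather than a full consistent-hash ring.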
javascript
class CacheNode {
  constructor(name) {
    this.name = name;
    this.store = new Map();
  }

  get(key) {
    // Check presence explicitly so falsy values such as 0 or '' are still returned
    return this.store.has(key) ? this.store.get(key) : null;
  }

  set(key, value) {
    this.store.set(key, value);
  }
}

class DistributedCache {
  constructor(nodes) {
    this.nodes = nodes;
  }

  getNode(key) {
    // Simple hash: sum of char codes modulo the node count
    const hash = [...key].reduce((acc, c) => acc + c.charCodeAt(0), 0);
    return this.nodes[hash % this.nodes.length];
  }

  get(key) {
    const node = this.getNode(key);
    return node.get(key);
  }

  set(key, value) {
    const node = this.getNode(key);
    node.set(key, value);
  }
}

const nodeA = new CacheNode('NodeA');
const nodeB = new CacheNode('NodeB');
const cache = new DistributedCache([nodeA, nodeB]);

cache.set('apple', 'fruit');
cache.set('carrot', 'vegetable');

console.log(cache.get('apple'));  // fruit
console.log(cache.get('carrot')); // vegetable
console.log(cache.get('banana')); // null (never stored)
Output
fruit
vegetable
null
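The modulo lookup in this example is easy to follow, but it is sensitive to the node count. The sketch below reuses the example's character-code hash and assumes a hypothetical key set named `key0` through `key999`. It counts how many of those keys would map to a different node index if the cluster grew from two nodes to three. The helper names are illustrative only.

```javascript
// Same hash as the example: sum of character codes.
function charCodeSum(key) {
  return [...key].reduce((acc, c) => acc + c.charCodeAt(0), 0);
}

// Index of the node a key maps to under plain modulo placement.
function nodeIndex(key, nodeCount) {
  return charCodeSum(key) % nodeCount;
}

// Count keys whose node index changes when the node count changes.
function countMovedKeys(keys, countBefore, countAfter) {
  return keys.filter(
    (k) => nodeIndex(k, countBefore) !== nodeIndex(k, countAfter)
  ).length;
}

// Hypothetical key set, for illustration only.
const sampleKeys = Array.from({ length: 1000 }, (_, i) => `key${i}`);
const movedCount = countMovedKeys(sampleKeys, 2, 3);
console.log(`${movedCount} of ${sampleKeys.length} keys map to a different node`);
```

A key keeps its index only when its hash leaves the same remainder modulo 2 and modulo 3, so with evenly spread hashes roughly two thirds of the keys move. Each moved key is a cache miss until it is repopulated.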
Common Pitfalls
Common mistakes when designing distributed caches include:
- Ignoring cache consistency: Not updating or invalidating the cache leads to stale data.
- Uneven data distribution: Poor hashing causes some nodes to be overloaded.
- No fault tolerance: A single node failure causes data loss or downtime.
- Over-caching: Caching too much data wastes memory and can slow down the cache.
Always plan for cache invalidation, use consistent hashing, and replicate data for reliability.
javascript
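/* Wrong: key length is a poor hash (many keys share a length, so load is uneven),
   and plain modulo remaps most keys whenever the node count changes. */
function getNodeSimple(key, nodes) {
  return nodes[key.length % nodes.length];
}

/* Right: use a consistent hashing library or algorithm to distribute keys evenly. */
// Sketch: hashRing is an array of nodes sorted by ascending `hash` property, and
// hashFn is any well-distributed string hash (both are assumptions, not a specific API).
function getNodeConsistentHash(key, hashRing, hashFn) {
  const keyHash = hashFn(key); // hash the key once, not once per ring entry
  // Find the first node clockwise from the key's hash; wrap to the start if none.
  return hashRing.find((node) => node.hash >= keyHash) || hashRing[0];
}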
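The ring lookup can be expanded into a small runnable version. This is a minimal sketch, not a production implementation: the `hash32` function (FNV-1a with an extra mixing step), the `HashRing` class, and the choice of 100 virtual points per node are all assumptions made for illustration. A real deployment would more likely rely on an established library.

```javascript
// Illustrative 32-bit string hash: FNV-1a followed by a murmur3-style final mix.
function hash32(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // convert to an unsigned 32-bit value
}

class HashRing {
  constructor(nodeNames, pointsPerNode = 100) {
    this.pointsPerNode = pointsPerNode;
    this.points = []; // { hash, node } entries kept sorted by hash
    nodeNames.forEach((name) => this.addNode(name));
  }

  // Place several virtual points per node so load spreads more evenly.
  addNode(name) {
    for (let i = 0; i < this.pointsPerNode; i++) {
      this.points.push({ hash: hash32(`${name}#${i}`), node: name });
    }
    this.points.sort((a, b) => a.hash - b.hash);
  }

  // Binary search for the first point at or after the key's hash,
  // wrapping to the first point when the key hashes past the last one.
  getNode(key) {
    const target = hash32(key);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].hash < target) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].node;
  }
}

// Count keys whose owning node changes after one node is added.
function countReassigned(keys, nodeNames, newNode) {
  const ring = new HashRing(nodeNames);
  const before = keys.map((k) => ring.getNode(k));
  ring.addNode(newNode);
  return keys.filter((k, i) => ring.getNode(k) !== before[i]).length;
}

// Hypothetical key set, for illustration only.
const ringKeys = Array.from({ length: 1000 }, (_, i) => `key${i}`);
console.log(countReassigned(ringKeys, ['NodeA', 'NodeB'], 'NodeC'));
```

Adding a node only inserts new points on the ring, so a key changes owner only when one of the new node's points lands between the key and its previous owner. With evenly spread points, roughly a third of the keys should move when going from two nodes to three, and no key moves between the two existing nodes.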
Quick Reference
- Consistent Hashing: Distributes keys evenly and minimizes rebalancing.
- Cache Invalidation: Use TTL or event-based invalidation to keep data fresh.
- Replication: Store copies on multiple nodes to handle failures.
- Fallback: On cache miss or failure, query the main database.
- Monitoring: Track cache hit/miss rates and node health.
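Three of the items above (TTL, fallback, and hit/miss tracking) fit into one small read-through wrapper. The sketch below is a single-process illustration under assumed names: `TtlCache`, `getOrLoad`, and the `loadFromDb` callback are hypothetical, and the injectable `now` function exists only to make expiry easy to demonstrate.

```javascript
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock (an assumption for easier testing)
    this.store = new Map(); // key -> { value, expiresAt }
    this.hits = 0;
    this.misses = 0;
  }

  // Return a fresh cached value, or undefined if absent or expired.
  get(key) {
    const entry = this.store.get(key);
    if (entry && entry.expiresAt > this.now()) {
      this.hits++;
      return entry.value;
    }
    this.store.delete(key); // drops an expired entry; no-op if absent
    this.misses++;
    return undefined;
  }

  set(key, value) {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Event-based invalidation: call this when the source record changes.
  invalidate(key) {
    this.store.delete(key);
  }

  // Fallback: on a miss, load from the main database and cache the result.
  async getOrLoad(key, loadFromDb) {
    const cached = this.get(key);
    if (cached !== undefined) return cached;
    const value = await loadFromDb(key);
    this.set(key, value);
    return value;
  }

  hitRate() {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

async function demo() {
  let clock = 0; // fake clock in milliseconds
  const cache = new TtlCache(1000, () => clock);
  const loadFromDb = async (key) => `row-for-${key}`; // stand-in for a real query
  await cache.getOrLoad('user:1', loadFromDb); // miss, loads from the database
  await cache.getOrLoad('user:1', loadFromDb); // hit
  clock = 2000; // the TTL has passed
  await cache.getOrLoad('user:1', loadFromDb); // miss again, reloads
  return { hits: cache.hits, misses: cache.misses };
}

demo().then((stats) => console.log(stats)); // { hits: 1, misses: 2 }
```

One limitation of this sketch is that a value of `undefined` cannot be cached, because `undefined` is used to signal a miss. In a multi-node setup the same logic would sit in the cache client, in front of the node lookup.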
Key Takeaways
Use consistent hashing to distribute cache keys evenly across nodes.
Implement cache invalidation or TTL to avoid stale data.
Replicate cache data to handle node failures and improve reliability.
Design clients to fall back to the main database on cache misses.
Monitor cache performance and node health regularly.
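Replication can be sketched in the same style as the earlier two-node example. The version below is an illustration under stated assumptions: it reuses the character-code hash, writes each key to a primary node and to the next node in the list, and reads from the first replica that is marked available. The `available` flag and the `ReplicatedCache` name are hypothetical. Real systems also need failure detection and replica repair, which this sketch leaves out.

```javascript
class CacheNode {
  constructor(name) {
    this.name = name;
    this.store = new Map();
    this.available = true; // toggled by hand here to simulate a failure
  }

  get(key) {
    return this.store.has(key) ? this.store.get(key) : null;
  }

  set(key, value) {
    this.store.set(key, value);
  }
}

class ReplicatedCache {
  constructor(nodes, replicas = 2) {
    this.nodes = nodes;
    this.replicas = Math.min(replicas, nodes.length);
  }

  // The primary node for a key, followed by the next nodes in list order.
  replicasFor(key) {
    const hash = [...key].reduce((acc, c) => acc + c.charCodeAt(0), 0);
    const start = hash % this.nodes.length;
    return Array.from(
      { length: this.replicas },
      (_, i) => this.nodes[(start + i) % this.nodes.length]
    );
  }

  // Write to every replica that is currently available.
  set(key, value) {
    this.replicasFor(key)
      .filter((node) => node.available)
      .forEach((node) => node.set(key, value));
  }

  // Read from the first available replica; null if none is reachable.
  get(key) {
    const node = this.replicasFor(key).find((n) => n.available);
    return node ? node.get(key) : null;
  }
}

const replicaNodes = ['NodeA', 'NodeB', 'NodeC'].map((n) => new CacheNode(n));
const replicated = new ReplicatedCache(replicaNodes);
replicated.set('apple', 'fruit');

// Simulate losing the primary for 'apple'; the read is served by the replica.
replicated.replicasFor('apple')[0].available = false;
console.log(replicated.get('apple')); // fruit
```

When every replica for a key is unavailable, `get` returns `null`, which the client can treat like any other cache miss and answer from the main database.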