0
0
HldConceptBeginner · 3 min read

What is Consistent Hashing: Explanation and Use Cases

Consistent hashing is a technique to distribute data across multiple servers so that when servers are added or removed, only a small portion of data needs to move. It uses a hash function to assign both servers and data to points on a circle, minimizing disruption and improving scalability.
⚙️

How It Works

Imagine a circular ring like a clock face. Both servers and data items are placed on this ring using a hash function that converts their IDs into positions on the circle. Each data item is stored on the first server found by moving clockwise from its position.

This way, when a server joins or leaves, only the data items near that server on the ring need to move, not all data. This is like adding or removing a seat at a round table; only the neighbors next to that seat are affected.

This approach avoids the problem of reshuffling all data, which happens in simple hashing methods, making consistent hashing ideal for systems that change often.

💻

Example

This Python example shows a simple consistent hashing ring with servers and data keys. It finds which server each key maps to.

python
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes=None):
        self.ring = dict()
        self.sorted_keys = []
        if nodes:
            for node in nodes:
                self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        key = self._hash(node)
        self.ring[key] = node
        self.sorted_keys.append(key)
        self.sorted_keys.sort()

    def remove_node(self, node):
        key = self._hash(node)
        del self.ring[key]
        self.sorted_keys.remove(key)

    def get_node(self, key_str):
        key = self._hash(key_str)
        for node_key in self.sorted_keys:
            if key <= node_key:
                return self.ring[node_key]
        return self.ring[self.sorted_keys[0]]

# Setup ring with 3 servers
ring = ConsistentHashRing(['ServerA', 'ServerB', 'ServerC'])

# Map some data keys
keys = ['apple', 'banana', 'cherry', 'date', 'fig']
for key in keys:
    print(f"{key} -> {ring.get_node(key)}")
Output
apple -> ServerB banana -> ServerC cherry -> ServerA date -> ServerA fig -> ServerB
🎯

When to Use

Use consistent hashing when you have a distributed system with many servers that store data, and these servers can be added or removed often. It helps keep the system balanced and reduces the cost of moving data.

Common real-world uses include:

  • Distributed caches like Memcached or Redis clusters
  • Load balancing in web servers
  • Distributed databases and storage systems

It is especially useful when you want to avoid downtime or heavy data reshuffling during scaling.

Key Points

  • Consistent hashing maps servers and data to a circle using a hash function.
  • Only data near a changed server moves when servers join or leave.
  • It improves scalability and fault tolerance in distributed systems.
  • Widely used in caching, load balancing, and distributed storage.

Key Takeaways

Consistent hashing minimizes data movement when servers change in a distributed system.
It uses a hash ring to assign both servers and data to positions for balanced distribution.
Ideal for scalable systems like caches, load balancers, and distributed databases.
Helps maintain system stability and performance during scaling or failures.