Adding and removing nodes in Redis - Deep Dive

Overview - Adding and removing nodes
What is it?
Adding and removing nodes in Redis means changing the number of servers (nodes) in a Redis cluster. Nodes are individual Redis instances that work together to store and manage data. Adding nodes helps the cluster grow and handle more data or traffic. Removing nodes shrinks the cluster, often for maintenance or cost-saving.
Why it matters
Without the ability to add or remove nodes, a Redis cluster would be stuck with a fixed size. This limits how much data it can store and how well it can handle many users at once. Being able to change the number of nodes lets Redis adapt to real-world needs, like growing a website or fixing broken servers without downtime.
Where it fits
Before learning this, you should understand what a Redis cluster is and how data is split across nodes (sharding). After this, you can learn about cluster rebalancing and failover to keep the cluster healthy and efficient.
Mental Model
Core Idea
Adding or removing nodes changes the Redis cluster size and requires moving data to keep everything balanced and available.
Think of it like...
Imagine a team of delivery drivers (nodes) sharing packages (data). Adding a driver means some packages get reassigned to the new driver to share the work. Removing a driver means their packages must be given to others so no package is lost.
┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│    Node 1     │◄──────│    Node 2     │──────►│    Node 3     │
│    (Redis)    │       │    (Redis)    │       │    (Redis)    │
└───────────────┘       └───────────────┘       └───────────────┘
       ▲                      ▲                      ▲
       │                      │                      │
   Data slots             Data slots             Data slots
   assigned               assigned               assigned

Adding a node inserts a new box and redistributes data slots.
Removing a node removes a box and moves its data slots to others.
Build-Up - 7 Steps
1. Foundation: What is a Redis node?
Concept: A Redis node is a single Redis server instance that stores part of the data in a cluster.
A Redis cluster is made of multiple nodes. The key space is divided into 16,384 hash slots. Each master node owns a subset of those slots and stores every key that maps to them, while replica nodes hold copies of a master's data.
Result
You understand that a node is a building block of a Redis cluster, holding some data.
Knowing what a node is helps you see why adding or removing nodes affects data distribution.
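To make the slot split concrete, here is a minimal sketch that divides the 16,384 slots into contiguous, near-equal ranges for a given number of masters. The function name split_slots is made up for illustration, and the exact boundaries chosen by redis-cli when it creates a cluster may differ by a slot or so from what this prints.

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in a Redis cluster


def split_slots(node_count: int) -> list[range]:
    """Divide slots 0..16383 into contiguous, near-equal ranges (illustrative only)."""
    if node_count < 1:
        raise ValueError("need at least one node")
    base, extra = divmod(TOTAL_SLOTS, node_count)
    ranges, start = [], 0
    for i in range(node_count):
        size = base + (1 if i < extra else 0)  # the first `extra` nodes get one more slot
        ranges.append(range(start, start + size))
        start += size
    return ranges


for i, r in enumerate(split_slots(3)):
    print(f"node {i}: slots {r.start}-{r.stop - 1} ({len(r)} slots)")
```

Every slot belongs to exactly one range, which is the property that matters: each key has exactly one owning master at any time.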
2. Foundation: How data is distributed in Redis cluster
Concept: Redis uses hash slots to split data evenly across nodes.
Redis assigns each key to one of 16,384 hash slots by computing the CRC16 of the key modulo 16384. These slots are divided among the master nodes. When you add or remove nodes, slots must be reassigned to keep data balanced.
Result
You see that data is not randomly placed but carefully split by hash slots.
Understanding hash slots is key to grasping why adding/removing nodes requires moving data.
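The key-to-slot mapping can be sketched in a few lines. This assumes Python's binascii.crc_hqx with an initial value of 0, which computes the CRC-16/XMODEM variant that the Redis Cluster specification describes. The function name key_slot is hypothetical. The sketch also applies the hash tag rule, where only the text between the first '{' and the next '}' is hashed if that text is non-empty.

```python
import binascii

TOTAL_SLOTS = 16384


def key_slot(key: str) -> int:
    """Return the hash slot for a key: CRC16(key) mod 16384, honoring hash tags."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # non-empty tag between the first '{' and the next '}'
            data = data[start + 1:end]
    # crc_hqx with initial value 0 is the CRC-16/XMODEM variant
    return binascii.crc_hqx(data, 0) % TOTAL_SLOTS


print(key_slot("foo"))
# Keys sharing a hash tag land in the same slot, so they live on the same node.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

On a live cluster, CLUSTER KEYSLOT returns the same number for a given key, which is a quick way to check a sketch like this one.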
3. Intermediate: Adding a node to a Redis cluster
Before reading on: do you think adding a node automatically moves data, or does it require manual steps? Commit to your answer.
Concept: Adding a node means assigning it some hash slots and moving keys from existing nodes to it.
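To add a node, you start a new Redis instance in cluster mode and join it to the cluster, for example with redis-cli --cluster add-node. The new node joins as an empty master that owns no hash slots, so it serves no data yet. You then reshard, for example with redis-cli --cluster reshard, to move some hash slots from existing nodes to the new one. Resharding moves the keys in those slots to the new node, balancing the cluster.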
Result
The cluster grows, and data is redistributed so the new node holds part of the data.
Knowing that adding nodes involves moving data helps you plan for the time and resources needed during scaling.
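The planning step can be sketched as follows: work out how many slots each existing master should give up so that the new master ends with an even share, then print the matching redis-cli calls without running them. The helper names, the node IDs (id-a, id-d, and so on) and the addresses are placeholders. The redis-cli options used (add-node, reshard, --cluster-from, --cluster-to, --cluster-slots, --cluster-yes) exist in Redis 5 and later.

```python
def plan_add_node(slots_owned: dict[str, int]) -> dict[str, int]:
    """Return how many slots each existing master should hand to one new, empty master."""
    total = sum(slots_owned.values())
    target = total // (len(slots_owned) + 1)  # even share once the new master joins
    return {node: owned - target
            for node, owned in slots_owned.items() if owned > target}


def reshard_commands(entry: str, moves: dict[str, int], new_id: str) -> list[str]:
    """Build one non-interactive reshard call per donor master (strings only, not executed)."""
    return [
        f"redis-cli --cluster reshard {entry} --cluster-from {donor} "
        f"--cluster-to {new_id} --cluster-slots {count} --cluster-yes"
        for donor, count in moves.items()
    ]


# Hypothetical three-master cluster; real node IDs are 40-character hex strings.
owned = {"id-a": 5461, "id-b": 5462, "id-c": 5461}
moves = plan_add_node(owned)
print("redis-cli --cluster add-node 127.0.0.1:7003 127.0.0.1:7000")
for cmd in reshard_commands("127.0.0.1:7000", moves, "id-d"):
    print(cmd)
```

Printing the commands first, rather than executing them, gives you a chance to review how much data will move before any key is touched.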
4. Intermediate: Removing a node from a Redis cluster
Before reading on: do you think removing a node deletes its data or moves it elsewhere? Commit to your answer.
Concept: Removing a node requires moving its hash slots and keys to other nodes before safely removing it.
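To remove a master node, you first migrate all of its hash slots to other nodes by resharding. Once the node owns no slots, you can remove it with redis-cli --cluster del-node, which refuses to delete a master that still holds slots. Moving the data first prevents data loss and lets you keep the remaining nodes balanced.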
Result
The cluster shrinks, but all data remains available on remaining nodes.
Understanding that data must be moved before removal prevents accidental data loss.
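Draining a node is the mirror image of adding one. The sketch below spreads a leaving master's slots over the remaining masters, always topping up whichever currently holds the fewest, and prints the commands in the order they would run. The function name, node IDs and addresses are illustrative assumptions.

```python
def plan_remove_node(slots_owned: dict[str, int], leaving: str) -> dict[str, int]:
    """Spread the leaving master's slots over the others, topping up the emptiest first."""
    remaining = {n: c for n, c in slots_owned.items() if n != leaving}
    if not remaining:
        raise ValueError("cannot drain the only master in the cluster")
    moves = dict.fromkeys(remaining, 0)
    for _ in range(slots_owned[leaving]):
        receiver = min(remaining, key=lambda n: remaining[n] + moves[n])
        moves[receiver] += 1
    return moves


# Hypothetical four-master cluster; id-d is the node being retired.
owned = {"id-a": 4096, "id-b": 4096, "id-c": 4096, "id-d": 4096}
moves = plan_remove_node(owned, "id-d")
for receiver, count in moves.items():
    print(f"redis-cli --cluster reshard 127.0.0.1:7000 --cluster-from id-d "
          f"--cluster-to {receiver} --cluster-slots {count} --cluster-yes")
# Only once id-d owns zero slots:
print("redis-cli --cluster del-node 127.0.0.1:7000 id-d")
```

The del-node call comes last on purpose: it is only safe after every slot has left the node.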
5. Intermediate: Resharding: moving hash slots between nodes
Before reading on: do you think resharding can happen automatically or only manually? Commit to your answer.
Concept: Resharding is the process of moving hash slots and their keys between nodes to balance data.
Redis provides redis-cli --cluster reshard for this, built on the lower-level CLUSTER SETSLOT and MIGRATE commands. You specify how many slots to move, from which nodes, and to which node. Redis then migrates the keys in those slots and updates the cluster state. This is needed when adding or removing nodes.
Result
Data moves between nodes while the cluster keeps serving requests; clients that reach the wrong node are redirected to the right one.
Knowing how resharding works helps you manage cluster size changes safely.
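Under the reshard tool, moving a single slot follows a fixed command sequence: mark the slot as importing on the destination, mark it as migrating on the source, move the keys in batches, then assign the slot to its new owner. The sketch below is an in-memory model of that sequence, not a client. The Node class and migrate_slot function are invented for illustration, and the log lines only show which command would be sent to which node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    addr: tuple[str, int]
    keys_by_slot: dict[int, set[str]] = field(default_factory=dict)


def migrate_slot(slot: int, src: Node, dst: Node, batch: int = 10) -> list[str]:
    """Move one slot from src to dst in the toy model; return the commands in order."""
    log = [
        f"[{dst.node_id}] CLUSTER SETSLOT {slot} IMPORTING {src.node_id}",
        f"[{src.node_id}] CLUSTER SETSLOT {slot} MIGRATING {dst.node_id}",
    ]
    host, port = dst.addr
    pending = src.keys_by_slot.get(slot, set())
    while pending:
        chunk = sorted(pending)[:batch]
        keys = " ".join(chunk)
        log.append(f"[{src.node_id}] CLUSTER GETKEYSINSLOT {slot} {batch}")
        # MIGRATE removes the keys from the source once the destination confirms them.
        log.append(f'[{src.node_id}] MIGRATE {host} {port} "" 0 5000 KEYS {keys}')
        dst.keys_by_slot.setdefault(slot, set()).update(chunk)
        pending.difference_update(chunk)
    src.keys_by_slot.pop(slot, None)
    log.append(f"[{src.node_id}] CLUSTER SETSLOT {slot} NODE {dst.node_id}")
    log.append(f"[{dst.node_id}] CLUSTER SETSLOT {slot} NODE {dst.node_id}")
    # Swap so the destination learns of its ownership first, as the docs recommend.
    log[-2], log[-1] = log[-1], log[-2]
    return log


src = Node("id-a", ("127.0.0.1", 7000), {4000: {"k1", "k2", "k3"}})
dst = Node("id-d", ("127.0.0.1", 7003))
for line in migrate_slot(4000, src, dst, batch=2):
    print(line)
```

In practice you would rarely send these commands by hand; redis-cli --cluster reshard issues the same kind of sequence for each slot it moves.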
6. Advanced: Handling cluster state during node changes
Before reading on: do you think cluster state updates happen instantly or require coordination? Commit to your answer.
Concept: The cluster must update its internal state to reflect new node assignments and maintain consistency.
When nodes are added or removed, Redis nodes communicate to update slot assignments and cluster topology. This coordination ensures clients can find keys correctly. The cluster uses a gossip protocol and configuration epochs to manage changes.
Result
Clients see a consistent view of the cluster and can route commands properly.
Understanding cluster state coordination explains how Redis avoids errors during node changes.
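A toy model can show why epochs matter and why propagation takes time. In the sketch below each node keeps its own view of who owns a slot, accepts a new claim only if it carries a higher configuration epoch, and pushes its view to one random peer per round. This is a deliberate simplification: the real cluster bus exchanges richer heartbeat messages, and the class and function names here are invented.

```python
import random


class NodeView:
    """One node's local belief about slot ownership: slot -> (owner, config_epoch)."""

    def __init__(self, name: str):
        self.name = name
        self.slots: dict[int, tuple[str, int]] = {}

    def claim(self, slot: int, owner: str, epoch: int) -> bool:
        """Accept a claim only if it is newer than what this node already knows."""
        known = self.slots.get(slot)
        if known is None or epoch > known[1]:
            self.slots[slot] = (owner, epoch)
            return True
        return False

    def gossip_to(self, peer: "NodeView") -> None:
        for slot, (owner, epoch) in self.slots.items():
            peer.claim(slot, owner, epoch)


def rounds_until_converged(nodes, slot, expected_owner, rng) -> int:
    """Count gossip rounds until every node agrees on the slot's owner."""
    rounds = 0
    while any(n.slots.get(slot, (None, 0))[0] != expected_owner for n in nodes):
        rounds += 1
        for n in nodes:
            peer = rng.choice([p for p in nodes if p is not n])
            n.gossip_to(peer)
    return rounds


nodes = [NodeView(name) for name in "ABCDEF"]
for n in nodes:
    n.claim(4000, "A", 1)           # everyone starts out believing A owns slot 4000
nodes[3].claim(4000, "D", 2)        # D takes the slot over with a higher epoch
print(rounds_until_converged(nodes, 4000, "D", random.Random(42)), "gossip rounds")
```

The count is at least one round, never zero, which is the point: for a short window different nodes can give different answers about the same slot.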
7. Expert: Surprises in node removal and failover
Before reading on: do you think removing a node with replicas affects failover? Commit to your answer.
Concept: Removing nodes with replicas requires careful handling to avoid losing redundancy and causing failover issues.
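If the node you are removing is a master with replicas, decide what happens to them first. You can fail over to one of its replicas, so the old master becomes a replica and can be removed without resharding, or you can migrate its slots away and reassign the replicas to other masters. Removing a master without handling its replicas can leave slots unserved or leave the cluster with less redundancy than planned. redis-cli --cluster, which replaced the older redis-trib.rb script in Redis 5, helps manage this safely.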
Result
Cluster remains highly available and consistent after node removal.
Knowing replica handling during node removal prevents costly downtime and data loss in production.
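A pre-removal check along these lines can catch the common mistakes before you run del-node. The sketch works on a simplified, hand-built description of the topology. ClusterNode and removal_checklist are hypothetical names, and a real script would fill the list from CLUSTER NODES output. CLUSTER FAILOVER and CLUSTER REPLICATE are the actual commands for promoting a replica and repointing one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterNode:
    node_id: str
    role: str                        # "master" or "replica"
    slot_count: int = 0
    master_id: Optional[str] = None  # set only for replicas


def removal_checklist(nodes: list[ClusterNode], target_id: str) -> list[str]:
    """Return the things to resolve before target_id is removed (empty list = clear)."""
    target = next(n for n in nodes if n.node_id == target_id)
    todo = []
    if target.role == "master":
        if target.slot_count > 0:
            todo.append(f"owns {target.slot_count} slots: reshard them away, or run "
                        "CLUSTER FAILOVER on one of its replicas first")
        replicas = [n.node_id for n in nodes if n.master_id == target_id]
        if replicas:
            todo.append(f"replicas {replicas} still follow it: repoint them with "
                        "CLUSTER REPLICATE or remove them too")
    return todo


# Hypothetical two-shard cluster with one replica per master.
cluster = [
    ClusterNode("m1", "master", slot_count=8192),
    ClusterNode("m2", "master", slot_count=8192),
    ClusterNode("r1", "replica", master_id="m1"),
    ClusterNode("r2", "replica", master_id="m2"),
]
for item in removal_checklist(cluster, "m1"):
    print("-", item)
```

An empty list means this simplified check found nothing blocking the removal; it does not replace running redis-cli --cluster check against the real cluster.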
Under the Hood
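Redis cluster divides data into 16,384 hash slots. Each master node owns some slots. Adding or removing nodes triggers slot migration, where keys are moved from one node to another, one slot at a time. Nodes communicate using a gossip protocol to share cluster state and slot ownership. Configuration epochs order these changes so that conflicting ownership claims can be resolved. While a slot is migrating, a client that asks the source node for a key that has already moved receives an ASK redirect; once the slot has changed owner, clients that contact the old owner receive a MOVED redirect to the new one.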
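Client-side handling of these redirects can be sketched briefly. The error text has the form "MOVED <slot> <host>:<port>" or "ASK <slot> <host>:<port>". The Redirect type, the function names and the slot_map cache below are illustrative assumptions, not part of any particular client library.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str   # "MOVED" (ownership changed) or "ASK" (one-off, slot is mid-migration)
    slot: int
    host: str
    port: int


def parse_redirect(error: str) -> Optional[Redirect]:
    """Parse a MOVED/ASK error reply; return None for any other error."""
    parts = error.lstrip("-").split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))


def apply_redirect(slot_map: dict[int, tuple[str, int]], r: Redirect) -> list[str]:
    """Describe what a client should do next; only MOVED updates the cached slot map."""
    target = f"{r.host}:{r.port}"
    if r.kind == "MOVED":
        slot_map[r.slot] = (r.host, r.port)
        return [f"retry on {target}"]
    return [f"send ASKING to {target}", f"retry on {target}"]


slot_map = {3999: ("127.0.0.1", 6380)}
for reply in ("-ASK 3999 127.0.0.1:6381", "-MOVED 3999 127.0.0.1:6381"):
    redirect = parse_redirect(reply)
    print(reply, "->", apply_redirect(slot_map, redirect), slot_map[3999])
```

The asymmetry is deliberate: ASK applies to a single request while a slot is in transit, so the cached map stays unchanged, whereas MOVED signals a lasting ownership change.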
Why designed this way?
Redis Cluster was designed for high availability and scalability. Splitting data into a fixed number of hash slots makes distribution and rebalancing straightforward, because moving data means moving whole slots. A gossip protocol lets nodes share cluster state without a single point of failure. This design balances simplicity, speed, and fault tolerance; a centralized coordinator would add a bottleneck and a single point of failure.
┌──────────────────┐     ┌──────────────────┐     ┌───────────────────┐
│      Node A      │◄────│      Node B      │────►│      Node C       │
│   Slots 0-5000   │     │ Slots 5001-10000 │     │ Slots 10001-16383 │
└──────────────────┘     └──────────────────┘     └───────────────────┘
          │
          │  Migrate slots 4000-5000 from Node A to new node D
          ▼
┌──────────────────┐
│      Node D      │
│ Slots 4000-5000  │
└──────────────────┘

After the migration, Node A keeps slots 0-3999.

Nodes update cluster state via gossip to reflect new slot ownership.
Myth Busters - 4 Common Misconceptions
Quick: Does removing a node delete its data permanently? Commit yes or no.
Common Belief: Removing a node deletes all its data immediately.
Reality: In a correct removal, the node's slots and keys are migrated to other nodes first, so no data is lost. redis-cli --cluster del-node refuses to remove a master that still owns slots.
Why it matters: Believing data is lost can cause panic, and skipping the migration step risks real data loss.
Quick: Can you add a node and have it start serving data instantly without resharding? Commit yes or no.
Common Belief: Adding a node automatically balances data without manual intervention.
Reality: A new node joins with no slots. You must reshard, or run a rebalance that includes empty masters, before it holds any data.
Why it matters: Assuming automatic balancing leads to unbalanced clusters and performance issues.
Quick: Does removing a master node with replicas require special steps? Commit yes or no.
Common Belief: You can remove any node anytime without extra steps.
Reality: You must handle replicas carefully to avoid failover problems.
Why it matters: Ignoring replicas can cause downtime or data inconsistency.
Quick: Is cluster state updated instantly and globally when nodes change? Commit yes or no.
Common Belief: Cluster state changes propagate instantly to all nodes.
Reality: State updates spread through a gossip protocol and take time to reach all nodes.
Why it matters: Expecting instant updates can cause confusion and errors during transitions.
Expert Zone
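1. When resharding, migrating slots in small batches reduces the impact on cluster performance and client latency.
2. Before promoting a replica to replace a master you plan to remove, check its replication lag; a replica that is behind can lose recent writes when promoted.
3. Configuration epochs resolve conflicting slot ownership claims (the claim with the highest epoch wins), but manual slot changes that reach only some nodes can leave the cluster temporarily disagreeing about who owns a slot.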
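The batching idea from the first point can be sketched as a small generator. The function name and the batch size of 100 are arbitrary choices for illustration; a real script would migrate each batch and then pause or check latency before continuing.

```python
from itertools import islice
from typing import Iterable, Iterator


def slot_batches(slots: Iterable[int], batch_size: int) -> Iterator[list[int]]:
    """Yield the slots to migrate in fixed-size groups; the last group may be smaller."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(slots)
    while batch := list(islice(it, batch_size)):
        yield batch


# Slots 4000-5000 inclusive, moved 100 at a time.
for number, batch in enumerate(slot_batches(range(4000, 5001), 100), start=1):
    print(f"batch {number}: slots {batch[0]}-{batch[-1]} ({len(batch)} slots)")
    # A real script would migrate this batch here, then wait before the next one.
```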
When NOT to use
Avoid adding or removing nodes frequently in very small clusters; instead, plan capacity ahead. For very large clusters, consider automated cluster management tools or managed Redis services that handle scaling transparently.
Production Patterns
In production, teams use rolling resharding during low traffic periods to add nodes. They automate replica promotion before master removal. Monitoring tools track slot distribution and cluster health to trigger scaling events.
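Tracking slot distribution usually starts from CLUSTER NODES output. The sketch below counts slots per master from that text format, where the fields after the eighth column are single slots, ranges, or bracketed migration markers. The sample text is made up for illustration (shortened node IDs, invented timestamps), and the function name is hypothetical.

```python
def slots_per_master(cluster_nodes_output: str) -> dict[str, int]:
    """Count owned slots for each master listed in CLUSTER NODES output."""
    counts = {}
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue                      # not a complete node line
        node_id, flags = fields[0], fields[2].split(",")
        if "master" not in flags:
            continue                      # replicas own no slots
        owned = 0
        for token in fields[8:]:
            if token.startswith("["):     # migrating/importing marker, not an owned range
                continue
            first, _, last = token.partition("-")
            owned += int(last or first) - int(first) + 1
        counts[node_id] = owned
    return counts


# Made-up sample: three loaded masters, one freshly added empty master, one replica.
SAMPLE = """\
aaa1 127.0.0.1:7000@17000 myself,master - 0 0 1 connected 0-5460
bbb2 127.0.0.1:7001@17001 master - 0 1700000000000 2 connected 5461-10922
ccc3 127.0.0.1:7002@17002 master - 0 1700000000000 3 connected 10923-16383
ddd4 127.0.0.1:7003@17003 master - 0 1700000000000 4 connected
eee5 127.0.0.1:7004@17004 slave aaa1 0 1700000000000 1 connected
"""

counts = slots_per_master(SAMPLE)
print(counts)
print("imbalance:", max(counts.values()) - min(counts.values()), "slots")
```

A large gap between the fullest and emptiest master, like the one caused by the empty node in this sample, is the kind of signal a monitoring job could use to schedule a reshard.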
Connections
Distributed Hash Table (DHT)
Redis cluster's hash slot distribution is a form of DHT used in distributed systems.
Understanding DHTs from peer-to-peer networks helps grasp how Redis distributes and locates data efficiently.
Load Balancing
Adding/removing nodes and resharding is similar to load balancing across servers.
Knowing load balancing principles clarifies why data must be moved to keep workload even.
Teamwork and Task Sharing
Adding/removing nodes is like changing team members and redistributing tasks.
This connection helps appreciate the importance of smooth transitions to avoid dropped work or confusion.
Common Pitfalls
#1 Removing a node without migrating its slots first.
Wrong approach: redis-cli --cluster del-node 127.0.0.1:7001 <node-id>, run while the node still owns slots.
Correct approach: Use redis-cli --cluster reshard to move all slots off the node, then run del-node after the migration.
Root cause: Misunderstanding that data must be moved before node removal to avoid data loss.
#2 Adding a node but not resharding slots to it.
Wrong approach: Start new Redis node and join cluster, then do nothing else.
Correct approach: After joining, run redis-cli --cluster reshard to assign slots to the new node.
Root cause: Assuming new nodes automatically get data without manual slot migration.
#3 Removing a master node with replicas without promoting replicas.
Wrong approach: Directly delete master node without handling replicas.
Correct approach: Promote a replica to master or reassign replicas before removing the master node.
Root cause: Ignoring replica roles and failover requirements.
Key Takeaways
Redis clusters split data into hash slots distributed across nodes to scale storage and performance.
Adding or removing nodes requires moving hash slots and keys to keep data balanced and available.
Resharding is the process of migrating hash slots between nodes and must be done carefully to avoid downtime.
Cluster state updates use gossip protocol and configuration epochs to maintain consistency during changes.
Handling replicas properly during node removal is critical to prevent data loss and maintain high availability.