
Master-replica architecture in Redis - Deep Dive

Overview - Master-replica architecture
What is it?
Master-replica architecture is a way to organize data storage where one main database (the master) handles all writes and updates, while one or more copies (replicas) keep synchronized copies of the data. The replicas mainly serve read requests to reduce the load on the master. This setup helps improve performance and availability of the database system.
Why it matters
Without master-replica architecture, a single database would handle all reads and writes, which can slow down applications and create a single point of failure. This architecture allows systems to handle more users smoothly and continue working even if one part fails, making apps faster and more reliable.
Where it fits
Before learning this, you should understand basic database concepts like what a database is and how data is stored. After mastering this, you can explore more advanced topics like distributed databases, sharding, and high availability strategies.
Mental Model
Core Idea
Master-replica architecture splits database roles so one handles writes and others handle reads, keeping data copies synchronized to improve speed and reliability.
Think of it like...
It's like a teacher (master) who writes notes on the board, and students (replicas) copy the notes. The teacher makes changes, and students update their copies to stay in sync. When others want to read the notes, they ask the students instead of the teacher, so the teacher can focus on writing.
┌─────────────┐       ┌─────────────┐
│   Master    │──────▶│  Replica 1  │
│  (writes)   │       │   (reads)   │
└─────────────┘       └─────────────┘
       │
       │              ┌─────────────┐
       └─────────────▶│  Replica 2  │
                      │   (reads)   │
                      └─────────────┘
Build-Up - 7 Steps
1
Foundation: Understanding Master and Replica Roles
Concept: Introduce the basic roles of master and replica in the architecture.
In master-replica architecture, the master database is the main source of truth. It handles all data changes, like adding or updating information. Replicas are read-only copies of the master: they do not change data themselves, but stay up to date by applying the changes the master sends them.
Result
You know that writes go to the master and reads can go to replicas.
Understanding the separation of roles helps grasp why this architecture improves performance and reliability.
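The role split can be sketched as a toy model in Python. This is an assumed simplification, not Redis's implementation; it only mirrors the default replica-read-only behavior, where a replica rejects writes:

```python
class Master:
    """Toy master: the single source of truth that accepts writes."""
    def __init__(self):
        self.data = {}

    def set(self, key, value):
        # All writes land here.
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


class Replica:
    """Toy replica: serves reads only, never accepts writes."""
    def __init__(self):
        self.data = {}

    def set(self, key, value):
        # Mirrors Redis's "-READONLY You can't write against a read only replica."
        raise RuntimeError("READONLY: replicas do not accept writes")

    def get(self, key):
        return self.data.get(key)


master = Master()
replica = Replica()
master.set("user:1", "Alice")    # OK: writes go to the master
print(master.get("user:1"))      # → Alice
```

Calling `replica.set(...)` in this sketch raises an error, just as a real Redis replica refuses the write.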
2
Foundation: How Data Synchronization Works
Concept: Explain how replicas stay updated with the master data.
When the master changes data, it sends those changes to its replicas. This process is called replication. Replicas apply the changes to keep their data the same as the master's. In Redis, replication is asynchronous, so replicas typically follow the master with a small delay.
Result
Replicas have nearly the same data as the master, ready to serve read requests.
Knowing replication keeps data consistent is key to trusting replicas for reads.
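A minimal sketch of this idea, assuming a simplified in-memory change log rather than Redis's real replication stream: the master records each write in order, and a replica replays any entries it has not yet applied.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.log = []                    # ordered stream of changes

    def set(self, key, value):
        self.data[key] = value
        self.log.append((key, value))    # every write is logged


class Replica:
    def __init__(self, master):
        self.master = master
        self.data = {}
        self.applied = 0                 # how far into the log we've replayed

    def sync(self):
        # Apply any log entries we haven't seen yet, in order.
        for key, value in self.master.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.master.log)


m = Master()
r = Replica(m)
m.set("color", "red")
r.sync()
print(r.data["color"])   # → red
```

The `applied` counter plays the role a replication offset plays in a real system: it marks how much of the master's stream the replica has consumed.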
3
Intermediate: Benefits of Master-Replica Setup
🤔 Before reading on: do you think master-replica architecture mainly improves write speed or read speed? Commit to your answer.
Concept: Explore why this architecture is used in real systems.
By sending all writes to the master and distributing reads to replicas, the system can handle many more read requests without slowing down. It also provides backup copies of data, so if the master fails, replicas can help recover.
Result
Systems become faster for users and more resilient to failures.
Understanding the performance and reliability benefits explains why this architecture is popular in real-world databases.
4
Intermediate: Replication Delay and Its Effects
🤔 Before reading on: do you think replicas always have exactly the same data as the master at every moment? Commit to yes or no.
Concept: Introduce the concept of replication lag and its impact.
Sometimes replicas receive updates a little later than the master. This delay is called replication lag. During this time, replicas might show slightly older data than the master. Applications must handle this carefully to avoid showing outdated information.
Result
You understand that replicas may not always be perfectly up-to-date.
Knowing about replication lag helps design applications that avoid confusing users with stale data.
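Replication lag can be made concrete with a toy log; the keys and values here are invented for illustration. Until the replica has replayed the whole log, it answers with stale data:

```python
# Two writes happened on the master; the replica hasn't replayed them all yet.
master_log = [("stock", 5), ("stock", 4)]
replica_data = {}
applied = 0

def replica_apply(n):
    """Replay the next n log entries on the replica."""
    global applied
    for key, value in master_log[applied:applied + n]:
        replica_data[key] = value
    applied += n

replica_apply(1)                 # replica has only replayed the first write
print(replica_data["stock"])     # → 5  (stale: the master already has 4)
replica_apply(1)                 # lag closes once the rest of the log arrives
print(replica_data["stock"])     # → 4
```

Between the two `replica_apply` calls, a read from the replica would tell a customer there are 5 items in stock when the master already knows there are 4; that window is replication lag.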
5
Intermediate: Read Scaling with Multiple Replicas
Concept: Show how adding replicas helps handle more read requests.
If one replica gets too busy, you can add more replicas. Each replica can serve read requests independently. This way, many users can read data at the same time without slowing down the system.
Result
The system can support more users reading data simultaneously.
Understanding horizontal scaling through replicas is essential for building scalable applications.
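One simple way to spread reads, sketched here with hypothetical replica names, is round-robin selection: each incoming read goes to the next replica in turn, so no single replica absorbs all the load.

```python
from itertools import cycle

replicas = ["replica-1", "replica-2", "replica-3"]   # hypothetical names
picker = cycle(replicas)

def route_read():
    # Each call hands the read to the next replica in rotation.
    return next(picker)

reads = [route_read() for _ in range(6)]
print(reads)   # → ['replica-1', 'replica-2', 'replica-3', 'replica-1', 'replica-2', 'replica-3']
```

Real load balancers use smarter policies (least connections, lag-aware weighting), but round-robin captures the core idea: adding a replica adds read capacity.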
6
Advanced: Failover and High Availability
🤔 Before reading on: do you think replicas can become masters automatically if the master fails? Commit to yes or no.
Concept: Explain how systems handle master failures using replicas.
If the master crashes, one replica can be promoted to become the new master. This process is called failover. It keeps the system running, although with asynchronous replication the most recent writes may not have reached any replica yet and can be lost. Failover also requires careful coordination to avoid conflicts.
Result
You see how master-replica architecture supports continuous operation.
Knowing failover mechanisms is critical for designing reliable production systems.
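A common promotion rule, used among other criteria by tools like Redis Sentinel, is to prefer the replica that has replayed the most of the replication stream. A toy sketch with invented replica names and offsets:

```python
# Hypothetical replica names mapped to their replication offsets
# (how far each one has replayed the master's change stream).
replicas = {
    "replica-1": 1042,
    "replica-2": 1050,   # most caught up
    "replica-3": 998,
}

def promote(replicas):
    # Pick the replica with the highest offset as the new master,
    # so the least amount of replicated data is lost.
    return max(replicas, key=replicas.get)

new_master = promote(replicas)
print(new_master)   # → replica-2
```

In production the decision also weighs replica health, priority settings, and quorum agreement among monitors, precisely to avoid the split-brain risk mentioned above.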
7
Expert: Consistency Models and Trade-offs
🤔 Before reading on: do you think master-replica architecture guarantees that all reads always see the latest writes? Commit to yes or no.
Concept: Discuss the consistency guarantees and their limits in master-replica setups.
Master-replica systems often use eventual consistency, meaning replicas will catch up eventually but may show old data temporarily. Strong consistency (always latest data) is harder to achieve and may reduce performance. Choosing the right balance depends on application needs.
Result
You understand the trade-offs between consistency, availability, and performance.
Grasping consistency trade-offs helps make informed decisions about database design and user experience.
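One pragmatic middle ground is a "read your own writes" policy: pin a client's reads to the master for a short window after that client writes, and let everyone else read from replicas. A sketch, where the window length is an arbitrary assumption:

```python
import time

recent_writers = {}   # client id → time of that client's last write
WINDOW = 1.0          # seconds to pin a writer's reads to the master (assumed)

def record_write(client):
    recent_writers[client] = time.monotonic()

def route_read(client):
    last = recent_writers.get(client)
    if last is not None and time.monotonic() - last < WINDOW:
        return "master"    # this client just wrote and must see its own write
    return "replica"       # everyone else tolerates eventual consistency

record_write("alice")
print(route_read("alice"))   # → master   (just wrote, must see fresh data)
print(route_read("bob"))     # → replica  (no recent write, stale is acceptable)
```

This keeps the heavy read traffic on replicas while hiding the most confusing symptom of lag: a user not seeing a change they just made.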
Under the Hood
Underneath, the master streams every write it performs to its replicas. In Redis this is the replication stream: the master propagates write commands to connected replicas, which replay them locally, and it keeps a replication backlog buffer so a briefly disconnected replica can catch up with a partial resynchronization instead of copying the full dataset. The master handles writes and appends changes, while replicas replay those changes to stay synchronized.
Why designed this way?
This design separates write and read workloads to improve performance and reliability. Early database systems struggled with scaling reads and handling failures. Master-replica architecture was created to distribute load and provide backup copies without complex distributed consensus, balancing simplicity and scalability.
┌──────────────┐
│    Master    │
│   (writes)   │
│ logs changes │──────────────┐
└──────────────┘              │
                              ▼
                    ┌─────────────────┐
                    │ Replication Log │
                    └─────────────────┘
                              │
          ┌───────────────────┼───────────────────┐
          ▼                   ▼                   ▼
   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
   │  Replica 1  │     │  Replica 2  │     │  Replica 3  │
   │   (reads)   │     │   (reads)   │     │   (reads)   │
   └─────────────┘     └─────────────┘     └─────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do replicas always have the exact same data as the master at every moment? Commit to yes or no.
Common Belief: Replicas are always perfectly up-to-date copies of the master.
Reality: Replicas can lag behind the master due to replication delay, showing slightly outdated data temporarily.
Why it matters: Assuming perfect synchronization can cause applications to show stale data or behave incorrectly when reading from replicas.
Quick: Can you write data directly to replicas in master-replica architecture? Commit to yes or no.
Common Belief: You can write data to any replica just like the master.
Reality: Replicas are read-only and do not accept writes; all writes must go to the master.
Why it matters: Trying to write to replicas causes errors and data inconsistency.
Quick: Does promoting a replica to master happen automatically without risk? Commit to yes or no.
Common Belief: Failover to a replica is always automatic and risk-free.
Reality: Failover requires careful coordination to avoid split-brain scenarios where two masters exist, risking data conflicts.
Why it matters: Mismanaging failover can cause data corruption and system downtime.
Quick: Does master-replica architecture guarantee strong consistency for all reads? Commit to yes or no.
Common Belief: All reads from replicas always reflect the latest writes immediately.
Reality: Most master-replica setups provide eventual consistency, not strong consistency.
Why it matters: Expecting immediate consistency can lead to confusing user experiences and bugs.
Expert Zone
1
In some database systems, replicas can be configured as delayed replicas that intentionally lag behind the master, protecting against accidental data loss (Redis has no built-in delay option, so achieving this requires external tooling).
2
Network partitions can cause replicas to lose connection to the master, requiring careful handling to avoid data divergence.
3
Write-heavy workloads can bottleneck on the master, so combining master-replica with sharding is common in large systems.
When NOT to use
Master-replica architecture is not ideal when strict strong consistency is required across all reads and writes instantly. In such cases, distributed consensus systems like Raft or Paxos-based databases are better. Also, for write-heavy workloads that exceed a single master’s capacity, sharding or multi-master setups might be preferred.
Production Patterns
In production, master-replica is used to scale read-heavy applications like caching layers, analytics, and reporting. Automated failover tools monitor master health and promote replicas when needed. Load balancers route read queries to replicas and write queries to the master. Monitoring replication lag and network health is standard practice.
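The routing step can be sketched by classifying commands before dispatch. The command sets below are a small illustrative subset of Redis's commands, not a complete list:

```python
# Small illustrative subsets of Redis commands, not exhaustive lists.
WRITE_COMMANDS = {"SET", "DEL", "INCR", "LPUSH", "HSET"}
READ_COMMANDS = {"GET", "MGET", "LRANGE", "HGETALL", "EXISTS"}

def route(command):
    """Decide whether a command should go to the master or a replica."""
    cmd = command.split()[0].upper()
    if cmd in WRITE_COMMANDS:
        return "master"
    if cmd in READ_COMMANDS:
        return "replica"
    return "master"   # safest default: send unknown commands to the master

print(route("SET user:1 Alice"))   # → master
print(route("GET user:1"))         # → replica
```

Defaulting unknown commands to the master is a deliberately conservative choice: a misrouted read costs a little master load, while a misrouted write fails outright.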
Connections
Eventual Consistency
Master-replica architecture often implements eventual consistency models.
Understanding eventual consistency clarifies why replicas may show stale data temporarily and how systems balance speed with accuracy.
Load Balancing
Master-replica setups use load balancing to distribute read requests across replicas.
Knowing load balancing helps optimize resource use and improve user experience by preventing overload on any single replica.
Human Memory and Recall
Like replicas copying from a master, human memory stores copies of information that may lag or differ slightly.
Recognizing this similarity helps appreciate the challenges of synchronization and consistency in distributed systems.
Common Pitfalls
#1 Reading from replicas without handling replication lag.
Wrong approach: GET user:123 sent to a replica with no lag check, so the value may be stale.
Correct approach: Check replication lag (compare the offsets reported by INFO replication) before reading, or read from the master for data that must be fresh.
Root cause: Assuming replicas always have the latest data leads to stale reads.
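A lag check of this kind can be sketched by comparing the master's and a replica's replication offsets, as reported by INFO replication; the threshold and offset values here are arbitrary assumptions:

```python
MAX_LAG = 100   # acceptable offset gap before we distrust the replica (assumed)

def choose_read_target(master_offset, replica_offset):
    """Read from the replica only when its lag is within the threshold."""
    lag = master_offset - replica_offset
    if lag > MAX_LAG:
        return "master"   # replica is too far behind for a fresh read
    return "replica"

print(choose_read_target(5000, 4990))   # → replica (lag 10, acceptable)
print(choose_read_target(5000, 4500))   # → master  (lag 500, too stale)
```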
#2 Writing data directly to a replica.
Wrong approach: SET user:1 "Alice" sent to a replica, which fails with a -READONLY error.
Correct approach: SET user:1 "Alice" sent to the master, which then replicates the change to its replicas.
Root cause: Misunderstanding that replicas are read-only and cannot accept writes.
#3 Failing to coordinate failover properly.
Wrong approach:Manually promoting a replica without checking master status or network partitions.
Correct approach:Use automated failover tools that verify master health and coordinate promotion safely.
Root cause:Ignoring the complexity of failover risks split-brain and data conflicts.
Key Takeaways
Master-replica architecture separates write and read workloads to improve database performance and reliability.
Replicas keep copies of the master data by continuously applying changes through replication, but may lag slightly behind.
This architecture enables scaling reads by adding replicas and supports failover to maintain availability.
Understanding replication lag and consistency trade-offs is essential to design correct and user-friendly applications.
Proper failover coordination and read/write routing are critical to avoid data loss and ensure smooth operation.