Overview - Read replicas for performance

What is it?

Read replicas are copies of a database that handle read requests to reduce the load on the main database. They improve performance by spreading read work across several servers. The main database still handles all writes, while each replica keeps its copy up to date asynchronously, so it can briefly lag behind. This setup allows many users to read data quickly without slowing down writes.

Why it matters

Without read replicas, all users send their read and write requests to the same database, which causes delays and slow responses when many people use the system at once. Read replicas solve this by sharing the read work, making apps faster and more reliable. This means a better user experience and a lower risk of overload during busy times.

Where it fits

Before learning about read replicas, you should understand basic databases and how they handle reading and writing data. After this, you can learn about advanced database scaling, caching, and multi-region setups to improve performance and availability further.

Mental Model

Core Idea

Read replicas are like extra helpers who copy the main database's information so many people can read data quickly without crowding the main source.

Think of it like...

Imagine a popular library with one main book. Instead of everyone crowding around that one book, the library makes several copies and places them on different tables. Readers can pick any copy to read, so no one waits too long.

Main Database (Writes) ──┐
                         │
                         ▼
                  ┌─────────────┐
                  │ Read Replica│
                  ├─────────────┤
                  │ Read Replica│
                  ├─────────────┤
                  │ Read Replica│
                  └─────────────┘

Users ──────────────▶ Read Replicas (handle reads)
Users ──────────────▶ Main Database (handle writes)

Build-Up - 7 Steps

1

Foundation - Understanding database read and write roles

Concept: Databases handle two main tasks: writing new data and reading existing data.

When you save information, the database writes it. When you look up information, the database reads it. Usually, the same database does both jobs, but this can slow things down if many people use it at once.

Result

You know that reading and writing are different tasks that can affect database speed.

Understanding the difference between reading and writing is key to knowing why splitting these tasks can improve performance.
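To make the read/write split concrete, here is a minimal sketch that labels a SQL statement as a read or a write by looking at its first keyword. The helper name `is_read_query` and the keyword set are assumptions made for illustration; a real system would also need to handle cases such as `SELECT ... FOR UPDATE` or statements wrapped in transactions.

```python
# Hypothetical helper: classify a SQL statement by its leading keyword.
# The keyword set is an illustrative assumption, not an exhaustive list.
READ_KEYWORDS = {"SELECT", "SHOW", "EXPLAIN"}


def is_read_query(sql: str) -> bool:
    """Return True when the statement starts with a read-only keyword."""
    stripped = sql.strip()
    if not stripped:
        return False
    first_word = stripped.split(None, 1)[0].upper()
    return first_word in READ_KEYWORDS


print(is_read_query("SELECT * FROM users"))                      # True
print(is_read_query("INSERT INTO users (name) VALUES ('Bob')"))  # False
```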

2

Foundation - What is a read replica in simple terms

3

Intermediate - How read replicas stay updated asynchronously

4

Intermediate - Benefits of using read replicas for performance

5

Intermediate - Limitations and consistency challenges of read replicas

6

Advanced - Configuring read replicas in AWS RDS

7

Expert - Advanced patterns and pitfalls with read replicas

Under the Hood

Read replicas work by copying data changes from the main database asynchronously. The main database records each change in a log (for example, the binary log in MySQL or the write-ahead log in PostgreSQL), which replicas read and apply to their own copies. Replication runs over the network using the database's replication protocol, keeping replicas close to in sync without blocking writes on the main database.

Why designed this way?

Asynchronous replication was chosen to avoid slowing down the main database's writes. Synchronous replication would make writes wait for replicas, reducing performance. The trade-off is eventual consistency, which is acceptable for many applications.

┌───────────────┐       ┌───────────────┐
│ Main Database │──────▶│ Replication   │
│   (Writes)    │       │   Log Stream  │
└───────────────┘       └───────────────┘
                              │
                              ▼
                    ┌───────────────────┐
                    │ Read Replica(s)   │
                    │ (Apply changes)   │
                    └───────────────────┘
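
The log-based flow above can be sketched as a small in-memory simulation. This is a toy model under stated assumptions, not how any particular engine implements replication: the `Primary` and `Replica` classes, the list used as a change log, and the `catch_up` method are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # ordered change log

    def write(self, key, value):
        # Apply the change locally and append it to the log.
        # The write returns without waiting for any replica.
        self.data[key] = value
        self.log.append((key, value))


@dataclass
class Replica:
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries applied so far

    def catch_up(self, primary, max_entries=None):
        """Apply pending log entries, optionally only the next max_entries."""
        pending = primary.log[self.applied:]
        if max_entries is not None:
            pending = pending[:max_entries]
        for key, value in pending:
            self.data[key] = value
        self.applied += len(pending)

    def lag(self, primary):
        """Replication lag measured in unapplied log entries."""
        return len(primary.log) - self.applied


primary = Primary()
replica = Replica()
primary.write("user:1", "Alice")
primary.write("user:2", "Bob")
print(replica.data.get("user:1"))  # None: the replica has not applied the log yet
print(replica.lag(primary))        # 2
replica.catch_up(primary)
print(replica.data.get("user:1"))  # Alice
print(replica.lag(primary))        # 0
```

The window between `write` and `catch_up` is the replication lag: the write has already succeeded on the primary, but a read from the replica still sees the old state.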

Myth Busters - 4 Common Misconceptions

Quick: Do read replicas always have the latest data? Commit to yes or no.

Common Belief: Read replicas always show the most current data just like the main database.

Reality: Replicas update asynchronously, so they can lag behind the main database and briefly return stale data.

Quick: Can read replicas handle write requests? Commit to yes or no.

Common Belief: Read replicas can handle both read and write requests to balance load.

Reality: Read replicas serve reads only; writes must go to the main database.

Quick: Do read replicas eliminate the need for database scaling? Commit to yes or no.

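Common Belief: Using read replicas means you never need to scale the main database or use other scaling methods.

Reality: Read replicas only offload reads. When write load is the bottleneck, the main database still needs scaling or other methods such as sharding.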

Quick: Are read replicas always located in the same region as the main database? Commit to yes or no.

Common Belief: Read replicas must be in the same region as the main database to work properly.

Reality: Replicas can be placed in other regions. In AWS, cross-region read replicas can serve global users faster.

Expert Zone

1

Replication lag varies with network and workload; monitoring it is essential to avoid stale reads.

2

Some queries require strong consistency and must bypass replicas to query the main database directly.

3

Combining read replicas with caching layers and sharding can optimize both read and write scalability.
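
The first two points can be combined into a simple routing rule. The sketch below assumes the application can obtain a lag value in seconds from its monitoring (for example, the `ReplicaLag` CloudWatch metric in Amazon RDS); the function name `choose_target` and the one-second default threshold are illustrative assumptions, not recommended values.

```python
from typing import Optional


def choose_target(needs_strong_consistency: bool,
                  replica_lag_seconds: Optional[float],
                  max_lag_seconds: float = 1.0) -> str:
    """Pick "primary" or "replica" for a read (hypothetical policy)."""
    if needs_strong_consistency:
        return "primary"  # this read must see the latest committed data
    if replica_lag_seconds is None:
        return "primary"  # lag unknown, so treat the replica as unsafe
    if replica_lag_seconds > max_lag_seconds:
        return "primary"  # replica is too far behind for this read
    return "replica"


print(choose_target(False, 0.2))   # replica
print(choose_target(False, 5.0))   # primary
print(choose_target(True, 0.0))    # primary
print(choose_target(False, None))  # primary
```

Falling back to the primary when lag is high protects correctness, but it also pushes load back onto the primary exactly when the system is under stress, so the threshold is a trade-off to tune per workload.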

When NOT to use

Read replicas are not suitable when applications require immediate consistency for all reads or when write load is the main bottleneck, because replicas lag behind the main database and do not accept writes. Alternatives include caching, sharding, or using databases with built-in strong consistency and horizontal scaling.

Production Patterns

In production, teams route read-heavy traffic to replicas using application logic or proxies, monitor replication lag closely, and use multi-region replicas for global users. They also combine replicas with caching and failover strategies for high availability.
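
As a rough illustration of routing in application logic, the sketch below sends writes to the primary and spreads reads across replicas in round-robin order. The class name `ReplicaRouter`, its method names, and the endpoint strings are hypothetical; in practice the targets would be connection pools or hostnames, and many teams delegate this job to a proxy.

```python
import itertools


class ReplicaRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() yields the replicas in round-robin order indefinitely.
        self._replica_cycle = itertools.cycle(replicas) if replicas else None

    def for_write(self):
        return self.primary

    def for_read(self, strong=False):
        # Strongly consistent reads, or a setup with no replicas, use the primary.
        if strong or self._replica_cycle is None:
            return self.primary
        return next(self._replica_cycle)


router = ReplicaRouter("primary-db", ["replica-1", "replica-2"])
print(router.for_write())            # primary-db
print(router.for_read())             # replica-1
print(router.for_read())             # replica-2
print(router.for_read())             # replica-1
print(router.for_read(strong=True))  # primary-db
```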

Connections

Caching

Builds-on

Both caching and read replicas reduce load on the main database, but caching stores data temporarily in memory, while replicas keep full copies of the database.

Eventual consistency

Same pattern

Read replicas demonstrate eventual consistency, a concept where data updates propagate over time, important in distributed systems.

Library book copies

Opposite domain analogy

Understanding how libraries use multiple copies to serve many readers helps grasp how read replicas serve many database readers efficiently.

Common Pitfalls

#1 Sending write queries to read replicas, causing errors.

Wrong approach: INSERT INTO users (name) VALUES ('Alice'); -- sent to read replica

Correct approach: INSERT INTO users (name) VALUES ('Alice'); -- sent to main database only

Root cause: Misunderstanding that replicas can handle writes leads to sending write commands to them.
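
To see what such an error looks like, the sketch below uses a read-only SQLite connection as a stand-in for a read replica. This is only an analogy: SQLite has no replication here, and the exact error message on a real replica depends on the database engine. The function name `write_to_read_only_copy` is hypothetical.

```python
import os
import sqlite3
import tempfile


def write_to_read_only_copy() -> str:
    """Try an INSERT on a read-only connection and return the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.db")

        # The "main database": a normal read-write connection.
        primary = sqlite3.connect(path)
        primary.execute("CREATE TABLE users (name TEXT)")
        primary.commit()
        primary.close()

        # The stand-in "replica": the same file opened in read-only mode.
        replica = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
        try:
            replica.execute("INSERT INTO users (name) VALUES ('Alice')")
            return "write succeeded"
        except sqlite3.OperationalError as exc:
            return str(exc)
        finally:
            replica.close()


print(write_to_read_only_copy())
```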

#2 Assuming read replicas always have the latest data and showing stale data to users.

Wrong approach: SELECT * FROM orders WHERE user_id=123; -- sent to replica immediately after write

Correct approach: SELECT * FROM orders WHERE user_id=123; -- sent to main database or after replication lag check

Root cause: Ignoring replication delay causes reading outdated information.
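
One common way to avoid this pitfall is to pin a user's reads to the main database for a short window after that user writes. The sketch below is a minimal version of that idea; the class name `ReadYourWritesRouter`, the two-second window, and the injectable `clock` parameter are assumptions for illustration, and the window should be chosen from observed replication lag.

```python
import time


class ReadYourWritesRouter:
    """Hypothetical policy: after a write, a user's reads go to the primary
    for pin_seconds, then return to the replicas."""

    def __init__(self, pin_seconds=2.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self._clock = clock
        self._last_write = {}  # user_id -> time of that user's latest write

    def record_write(self, user_id):
        self._last_write[user_id] = self._clock()

    def target_for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self._clock() - last < self.pin_seconds:
            return "primary"  # the replica may not have this user's write yet
        return "replica"


router = ReadYourWritesRouter(pin_seconds=2.0)
router.record_write(123)
print(router.target_for_read(123))  # primary
print(router.target_for_read(456))  # replica
```

Only the user who just wrote is pinned, so the rest of the read traffic still benefits from the replicas.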

#3 Routing all traffic, including writes, to replicas to reduce load.

Wrong approach: Application sends all queries to replicas to improve speed.

Correct approach: Application routes reads to replicas and writes to main database separately.

Root cause: Not separating read and write traffic leads to errors and data inconsistency.

Key Takeaways

Read replicas copy the main database to handle read requests and reduce load on the main database.

They update asynchronously, so data on replicas may be briefly out of date.

Using read replicas improves performance and reliability but requires careful query routing and monitoring.

Not all queries should go to replicas; writes and strongly consistent reads must go to the main database.

In AWS, read replicas can be created across regions to serve global users faster.