| Users/Traffic | Write Model | Read Model | Data Sync | Infrastructure |
|---|---|---|---|---|
| 100 users | Single write service, single DB | Single read DB, simple queries | Simple event propagation, low latency | 1 app server, 1 DB server |
| 10K users | Write service scales horizontally, DB with connection pooling | Read replicas added, caching introduced | Event queue for async updates | Multiple app servers, read replicas |
| 1M users | Write DB sharded by user or domain, write service scaled | Read DBs sharded, heavy caching (Redis) and search indexes (Elasticsearch) | Robust event streaming (Kafka), eventual consistency | Clustered microservices, message brokers |
| 100M users | Multi-region write DB shards, global write services | Global read replicas, CDN for read-heavy data | Highly available event streaming, conflict resolution | Geo-distributed clusters, advanced monitoring |
CQRS pattern in Microservices - Scalability & System Analysis
At small scale, the write database is the first bottleneck because all commands must be processed and stored reliably. As users grow, the write DB hits limits on connections and transaction throughput.
The read side scales more easily with replicas and caching, so write DB capacity and consistency become the main challenge.
- Horizontal scaling: Add more write service instances behind a load balancer.
- Database sharding: Split write DB by user or domain to reduce contention.
- Read replicas: Use multiple read-only DB replicas to handle query load.
- Caching: Use in-memory caches (Redis) or search engines (Elasticsearch) for fast reads.
- Event streaming: Use message brokers (Kafka) for reliable async data sync between write and read models.
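- Conflict resolution: Accept eventual consistency between the write and read models, and resolve concurrent-write conflicts with an explicit rule (for example, version checks or last-write-wins).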
- Geo-distribution: Deploy services and DB shards in multiple regions for latency and availability.
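
As an illustration of the database sharding item in the list above, here is a minimal sketch of routing a write to a shard by hashing the user ID. The shard names and the `route_write` helper are hypothetical, and plain modulo hashing is an assumption made for brevity.

```python
import hashlib

# Hypothetical shard names; a real deployment would hold connection settings here.
SHARDS = ["write-db-0", "write-db-1", "write-db-2", "write-db-3"]


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so would not route consistently.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route_write(user_id: str) -> str:
    """Return the name of the shard that owns this user's writes."""
    return SHARDS[shard_for(user_id, len(SHARDS))]


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, "->", route_write(uid))
```

With modulo hashing, changing the number of shards remaps most keys, so systems that expect to add shards often use consistent hashing or a lookup table instead.
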
Assuming 1M users with 10,000 requests per second (RPS) total:
- Write QPS: ~1000 (assuming 10% writes)
- Read QPS: ~9000 (read-heavy)
- Write DB: Needs to handle ~1000 transactions/sec, which calls for sharding or vertical scaling
- Read DB replicas: Each can handle ~5000 QPS, so 2 replicas cover ~9000 QPS at about 90% utilization; a third adds headroom
- Cache: Redis can handle 100K ops/sec, well above the ~9000 read QPS, so it is enough for read caching
- Network bandwidth: 1 Gbps (~125 MB/s) sufficient for event streaming and API traffic
- Storage: Write DB stores all commands, estimate 1 KB per write -> ~1 MB/s write storage growth
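
The arithmetic behind these estimates can be reproduced with a short script. This is a sketch using only the figures stated above (10,000 RPS, 10% writes, ~5000 QPS per replica, 1 KB per write); the function name and the choice of decimal units (1 KB = 1000 bytes) are assumptions.

```python
import math

TOTAL_RPS = 10_000        # total requests per second
WRITE_PERCENT = 10        # share of requests that are writes
REPLICA_QPS = 5_000       # assumed capacity of one read replica
WRITE_SIZE_BYTES = 1_000  # 1 KB per write, decimal units


def estimate_capacity() -> dict:
    """Back-of-envelope sizing from the assumptions above."""
    write_qps = TOTAL_RPS * WRITE_PERCENT // 100
    read_qps = TOTAL_RPS - write_qps
    replicas = math.ceil(read_qps / REPLICA_QPS)
    growth_mb_per_s = write_qps * WRITE_SIZE_BYTES / 1_000_000
    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "read_replicas": replicas,
        # Fraction of total replica capacity used by the read load.
        "replica_utilization": read_qps / (replicas * REPLICA_QPS),
        "storage_growth_mb_per_s": growth_mb_per_s,
        # 86,400 seconds per day, 1000 MB per GB.
        "storage_growth_gb_per_day": growth_mb_per_s * 86_400 / 1_000,
    }


if __name__ == "__main__":
    for name, value in estimate_capacity().items():
        print(f"{name}: {value}")
```

At ~1 MB/s the write store grows by roughly 86 GB per day, which is worth stating alongside the per-second figure when planning retention.
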
Start by explaining the separation of write and read models in CQRS and why it helps scaling.
Discuss bottlenecks on the write side first, then how read replicas and caching improve read scalability.
Mention event streaming for syncing data and eventual consistency trade-offs.
Finally, talk about sharding and geo-distribution for very large scale.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Partition the load. Shard the write database by user or domain so writes spread across several nodes, and move reads to replicas or a cache, because a single DB rated for 1000 QPS cannot reliably absorb 10,000 QPS.
Practice
CQRS pattern in microservices architecture?
Solution
Step 1: Understand CQRS concept
CQRS stands for Command Query Responsibility Segregation, which means separating commands (writes) from queries (reads).
Step 2: Identify the main benefit
This separation allows each side to be optimized and scaled independently, improving performance and maintainability.
Final Answer:
To separate read and write operations for better scalability -> Option A
Quick Check:
CQRS = Separate reads and writes [OK]
- Thinking CQRS merges all operations into one service
- Confusing CQRS with encryption or caching
- Assuming CQRS only applies to database encryption
Solution
Step 1: Define command side role
The command side in CQRS is responsible for handling commands, which are operations that change the system's state (writes).
Step 2: Eliminate incorrect options
Read-only queries belong to the query side, caching is a separate concern, and authentication is unrelated to CQRS commands.
Final Answer:
Processes write operations that change state -> Option C
Quick Check:
Command side = writes [OK]
- Confusing command side with query side
- Thinking command side handles caching
- Mixing authentication with CQRS commands
1. User sends a command to update an order.
2. Command handler updates the write database.
3. An event is published.
4. The read model updates asynchronously.
What is the main reason for step 4?
Solution
Step 1: Understand event role in CQRS
After the write database updates, an event signals that data changed.
Step 2: Purpose of read model update
The read model consumes the event asynchronously and applies the change, so queries reflect the latest data and the read side stays eventually consistent with writes.
Final Answer:
To keep the read database in sync with the write database -> Option B
Quick Check:
Event updates read model = sync reads [OK]
- Thinking event validates or rolls back commands
- Confusing encryption with event handling
- Assuming read model updates happen synchronously
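
The four-step flow in this question can be sketched in a few lines. This is an in-memory illustration only: the `OrderService` class, the `OrderUpdated` event, and the dictionaries standing in for the write and read databases are all hypothetical, and a `deque` stands in for a message broker.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderUpdated:
    """Event published after a successful write."""
    order_id: str
    status: str


class OrderService:
    def __init__(self) -> None:
        self.write_db = {}     # stand-in for the write database
        self.read_db = {}      # stand-in for the read database
        self.events = deque()  # stand-in for a message broker

    def handle_update_order(self, order_id: str, status: str) -> None:
        """Steps 1-3: accept the command, update the write DB, publish an event."""
        self.write_db[order_id] = {"status": status}
        self.events.append(OrderUpdated(order_id, status))

    def project_pending_events(self) -> int:
        """Step 4: apply queued events to the read model; returns how many ran."""
        applied = 0
        while self.events:
            event = self.events.popleft()
            self.read_db[event.order_id] = {"status": event.status}
            applied += 1
        return applied

    def query_order(self, order_id: str):
        """Queries only ever touch the read model."""
        return self.read_db.get(order_id)


if __name__ == "__main__":
    svc = OrderService()
    svc.handle_update_order("o-1", "shipped")
    print("before projection:", svc.query_order("o-1"))
    svc.project_pending_events()
    print("after projection:", svc.query_order("o-1"))
```

Between the command and the projection the query returns nothing for the order, which is the eventual-consistency window that step 4 closes.
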
Solution
Step 1: Identify cause of stale read data
In CQRS, the read model updates asynchronously via events. If events are delayed or lost, the read model lags behind.
Step 2: Rule out other causes
If the write database failed, writes wouldn't succeed. Client caching or replication issues are less likely to cause this specific CQRS symptom.
Final Answer:
The event to update the read model is delayed or lost -> Option D
Quick Check:
Stale reads = delayed event update [OK]
- Blaming write database failure without evidence
- Ignoring event delivery reliability
- Assuming client caching is always the cause
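
One way to make delayed or lost events visible is to number them and have the read side track the last number it applied. The sketch below assumes the write side assigns a contiguous sequence number to each event; the `Event` and `Projector` names and the `seq` field are hypothetical, not part of any particular broker's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    seq: int       # assumed: contiguous sequence number set by the write side
    order_id: str
    status: str


@dataclass
class Projector:
    read_db: dict = field(default_factory=dict)
    last_seq: int = 0  # highest sequence number applied so far

    def apply(self, event: Event) -> str:
        """Apply an event only if it is the next one in order."""
        if event.seq <= self.last_seq:
            return "duplicate"  # already applied; safe to ignore on redelivery
        if event.seq != self.last_seq + 1:
            return "gap"        # an earlier event is missing or still in flight
        self.read_db[event.order_id] = event.status
        self.last_seq = event.seq
        return "applied"

    def lag(self, latest_write_seq: int) -> int:
        """How many events the read model is behind the write side."""
        return latest_write_seq - self.last_seq


if __name__ == "__main__":
    p = Projector()
    print(p.apply(Event(1, "o-1", "created")))
    print(p.apply(Event(3, "o-1", "shipped")))  # event 2 has not arrived yet
    print("lag:", p.lag(3))
```

A non-zero lag or a "gap" result points at the event pipeline rather than at the write database or client caches, which matches the diagnosis in the solution above.
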
Solution
Step 1: Understand scaling needs in CQRS
Separating read and write databases allows independent scaling and optimization for each workload.
Step 2: Evaluate options for scaling reads
Event-driven synchronization keeps the read database updated asynchronously, enabling fast, scalable queries without locking.
Step 3: Reject unsuitable options
A single database with locking limits scalability; client caching risks serving stale, inconsistent data; querying the write DB for reads causes contention.
Final Answer:
Use separate databases for read and write models with event-driven synchronization -> Option A
Quick Check:
Separate DBs + events = scalable CQRS [OK]
- Using one DB with locking reduces scalability
- Relying on client caching risks consistency
- Reading directly from write DB causes contention
