CQRS pattern in Microservices - Scalability & System Analysis

Scalability Analysis - CQRS pattern
Growth Table: CQRS Pattern Scaling
Users/Traffic | Write Model | Read Model | Data Sync | Infrastructure
100 users | Single write service, single DB | Single read DB, simple queries | Simple event propagation, low latency | 1 app server, 1 DB server
10K users | Write service scales horizontally, DB with connection pooling | Read replicas added, caching introduced | Event queue for async updates | Multiple app servers, read replicas
1M users | Write DB sharded by user or domain, write service scaled | Read DBs sharded, heavy caching (Redis/Elasticsearch) | Robust event streaming (Kafka), eventual consistency | Clustered microservices, message brokers
100M users | Multi-region write DB shards, global write services | Global read replicas, CDN for read-heavy data | Highly available event streaming, conflict resolution | Geo-distributed clusters, advanced monitoring
First Bottleneck

The write database is the first bottleneck because every command must be processed and stored reliably in one place. As the user count grows, the write DB hits limits on connections and transaction throughput.

The read side scales more easily through replicas and caching, so write DB capacity and consistency become the main challenge.

Scaling Solutions
  • Horizontal scaling: Add more write service instances behind a load balancer.
  • Database sharding: Split write DB by user or domain to reduce contention.
  • Read replicas: Use multiple read-only DB replicas to handle query load.
  • Caching: Use in-memory caches (Redis) or search engines (Elasticsearch) for fast reads.
  • Event streaming: Use message brokers (Kafka) for reliable async data sync between write and read models.
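  • Conflict resolution: Accept eventual consistency between the write and read models, and define how conflicting updates are detected and resolved instead of leaving the outcome to chance.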
  • Geo-distribution: Deploy services and DB shards in multiple regions for latency and availability.
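
The sharding item above can be illustrated with a minimal sketch. It assumes writes are partitioned by user ID using a stable hash; `shard_for` and `ShardedWriteStore` are hypothetical names, and plain dictionaries stand in for real database shards.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index using a stable hash."""
    # hashlib is used instead of the built-in hash(), which is salted
    # per process for strings and so is not stable across restarts.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedWriteStore:
    """Hypothetical write store; each dict stands in for one DB shard."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def save(self, user_id: str, record: dict) -> None:
        # All writes for one user land on the same shard.
        self.shards[shard_for(user_id, len(self.shards))][user_id] = record

    def load(self, user_id: str):
        return self.shards[shard_for(user_id, len(self.shards))].get(user_id)


store = ShardedWriteStore(num_shards=4)
store.save("user-42", {"plan": "pro"})
print(store.load("user-42"))
```

Simple modulo hashing remaps most keys when the shard count changes, so a real deployment would more likely use consistent hashing or a lookup table.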
Back-of-Envelope Cost Analysis

Assuming 1M users with 10,000 requests per second (RPS) total:

  • Write QPS: ~1,000 (assuming 10% writes)
  • Read QPS: ~9,000 (read-heavy)
  • Write DB: Needs to handle ~1,000 transactions/sec, which calls for sharding or a larger, vertically scaled instance
  • Read DB replicas: Assuming each handles ~5,000 QPS, 2 replicas cover the load (9,000 / 5,000 = 1.8); a third adds headroom and failover capacity
  • Cache: Redis can handle on the order of 100K ops/sec, enough for read caching
  • Network bandwidth: 1 Gbps (~125 MB/s) is sufficient for event streaming and API traffic
  • Storage: The write DB stores all commands; at an estimated 1 KB per write and ~1,000 writes/sec, storage grows by ~1 MB/s (roughly 86 GB per day)
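
The arithmetic behind these estimates can be reproduced with a short script. This is a sketch under the same assumptions as the list above (10,000 RPS, 10% writes, ~5,000 QPS per replica, 1 KB per write, with 1 KB taken as 1,000 bytes); the function name `estimate` is hypothetical.

```python
import math


def estimate(total_rps: int, write_percent: int,
             replica_qps: int, bytes_per_write: int) -> dict:
    """Back-of-envelope sizing for a CQRS deployment."""
    write_qps = total_rps * write_percent // 100
    read_qps = total_rps - write_qps
    # Round up: a fractional replica still needs a whole server.
    replicas = math.ceil(read_qps / replica_qps)
    growth_bytes_per_sec = write_qps * bytes_per_write
    growth_gb_per_day = growth_bytes_per_sec * 86_400 / 1e9
    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "read_replicas": replicas,
        "growth_mb_per_sec": growth_bytes_per_sec / 1e6,
        "growth_gb_per_day": growth_gb_per_day,
    }


result = estimate(total_rps=10_000, write_percent=10,
                  replica_qps=5_000, bytes_per_write=1_000)
for name, value in result.items():
    print(f"{name}: {value}")
```

Changing the inputs shows how sensitive the sizing is to the write ratio: at 30% writes, the write DB must absorb 3,000 transactions/sec while the read side still needs 2 replicas.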
Interview Tip

Start by explaining the separation of write and read models in CQRS and why it helps scaling.

Discuss bottlenecks on the write side first, then how read replicas and caching improve read scalability.

Mention event streaming for syncing data and eventual consistency trade-offs.

Finally, talk about sharding and geo-distribution for very large scale.

Self Check

Your database handles 1000 QPS. Traffic grows 10x. What do you do first?

Answer: Shard (partition) the write database so the load is distributed across several nodes, because a single DB that handles 1,000 QPS cannot absorb 10,000 QPS reliably. For the read portion of that traffic, add read replicas and caching.

Key Result
CQRS scales well by separating write and read workloads; the write database is the first bottleneck and requires sharding or horizontal scaling, while the read side benefits from replicas and caching.

Practice

1. What is the main purpose of the CQRS pattern in microservices architecture?
easy
A. To separate read and write operations for better scalability
B. To combine all database operations into a single service
C. To encrypt data during transmission between services
D. To cache all data on the client side for faster access

Solution

  1. Step 1: Understand CQRS concept

    CQRS stands for Command Query Responsibility Segregation, which means separating commands (writes) from queries (reads).
  2. Step 2: Identify the main benefit

    This separation allows each side to be optimized and scaled independently, improving performance and maintainability.
  3. Final Answer:

    To separate read and write operations for better scalability -> Option A
  4. Quick Check:

    CQRS = Separate reads and writes [OK]
Hint: CQRS splits commands from queries [OK]
Common Mistakes:
  • Thinking CQRS merges all operations into one service
  • Confusing CQRS with encryption or caching
  • Assuming CQRS only applies to database encryption
2. Which of the following is the correct way to describe the command side in CQRS?
easy
A. Handles read-only queries to fetch data
B. Manages user authentication and sessions
C. Processes write operations that change state
D. Caches data for faster retrieval

Solution

  1. Step 1: Define command side role

    The command side in CQRS is responsible for handling commands, which are operations that change the system's state (writes).
  2. Step 2: Eliminate incorrect options

    Read-only queries belong to the query side, caching is a separate concern, and authentication is unrelated to CQRS commands.
  3. Final Answer:

    Processes write operations that change state -> Option C
  4. Quick Check:

    Command side = writes [OK]
Hint: Commands change data, queries read data [OK]
Common Mistakes:
  • Confusing command side with query side
  • Thinking command side handles caching
  • Mixing authentication with CQRS commands
3. Given the following simplified CQRS flow:
1. User sends a command to update an order.
2. Command handler updates the write database.
3. An event is published.
4. The read model updates asynchronously.
What is the main reason for step 4?
medium
A. To validate the command before processing
B. To keep the read database in sync with the write database
C. To rollback the write operation if needed
D. To encrypt the data before sending to the client

Solution

  1. Step 1: Understand event role in CQRS

    After the write database updates, an event signals that data changed.
  2. Step 2: Purpose of read model update

    The read model updates asynchronously to reflect the latest data for queries, keeping it eventually consistent with the write side.
  3. Final Answer:

    To keep the read database in sync with the write database -> Option B
  4. Quick Check:

    Event updates read model = sync reads [OK]
Hint: Events update read model after writes [OK]
Common Mistakes:
  • Thinking event validates or rolls back commands
  • Confusing encryption with event handling
  • Assuming read model updates happen synchronously
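
The four-step flow from question 3 can be sketched in a few lines. This is an illustrative in-memory model, not a production design: `OrderSystem` and its method names are hypothetical, dictionaries stand in for the two databases, and a `deque` stands in for a message broker.

```python
from collections import deque


class OrderSystem:
    """In-memory stand-in for a CQRS write side, event queue, and read side."""

    def __init__(self) -> None:
        self.write_db = {}     # source of truth, updated by commands
        self.read_db = {}      # query-optimized view, updated by events
        self.events = deque()  # stand-in for a message broker

    def handle_update_order(self, order_id: str, status: str) -> None:
        # Steps 1-3: apply the command to the write DB, then publish an event.
        self.write_db[order_id] = {"status": status}
        self.events.append(
            {"type": "OrderUpdated", "order_id": order_id, "status": status}
        )

    def project_pending_events(self) -> None:
        # Step 4: runs later and separately from the command handler.
        while self.events:
            event = self.events.popleft()
            self.read_db[event["order_id"]] = {"status": event["status"]}

    def query_order(self, order_id: str):
        # Queries only ever touch the read model.
        return self.read_db.get(order_id)


system = OrderSystem()
system.handle_update_order("order-1", "shipped")
print(system.query_order("order-1"))  # read model has not caught up yet
system.project_pending_events()
print(system.query_order("order-1"))  # read model now reflects the write
```

The first query returns nothing because the event has not been projected yet, which is the eventual-consistency window that step 4 closes.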
4. In a CQRS system, a developer notices that the read model sometimes shows stale data after a write. What is the most likely cause?
medium
A. The client is caching old data aggressively
B. The command handler failed to update the write database
C. The write database is not replicated properly
D. The event to update the read model is delayed or lost

Solution

  1. Step 1: Identify cause of stale read data

    In CQRS, the read model updates asynchronously via events. If events are delayed or lost, the read model lags behind.
  2. Step 2: Rule out other causes

    If the write database failed, writes wouldn't succeed. Client caching or replication issues are less likely to cause this specific CQRS symptom.
  3. Final Answer:

    The event to update the read model is delayed or lost -> Option D
  4. Quick Check:

    Stale reads = delayed event update [OK]
Hint: Stale reads usually mean event delay or loss [OK]
Common Mistakes:
  • Blaming write database failure without evidence
  • Ignoring event delivery reliability
  • Assuming client caching is always the cause
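
One way to make delayed or lost events visible is to version them. The sketch below assumes each event carries a per-order version number that increases by one with every write; that scheme, the `ReadModel` class, and the event fields are all hypothetical and used only for illustration.

```python
class ReadModel:
    """Read-side projection that detects duplicate and missing events."""

    def __init__(self) -> None:
        self.rows = {}  # order_id -> (version, status)

    def apply(self, event: dict) -> str:
        """Apply an event and return 'applied', 'duplicate', or 'gap'."""
        current_version = self.rows.get(event["order_id"], (0, None))[0]
        if event["version"] <= current_version:
            return "duplicate"  # redelivered event, safe to ignore
        if event["version"] > current_version + 1:
            return "gap"        # an earlier event is delayed or lost
        self.rows[event["order_id"]] = (event["version"], event["status"])
        return "applied"


model = ReadModel()
print(model.apply({"order_id": "o1", "version": 1, "status": "created"}))
print(model.apply({"order_id": "o1", "version": 3, "status": "shipped"}))
print(model.apply({"order_id": "o1", "version": 2, "status": "paid"}))
print(model.apply({"order_id": "o1", "version": 3, "status": "shipped"}))
```

When `apply` reports a gap, the projector could hold the event back and retry, or rebuild the row from the write side; which option fits depends on the broker's delivery guarantees.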
5. You are designing a high-traffic e-commerce system using CQRS. Which approach best handles the challenge of scaling the read side independently from the write side?
hard
A. Use separate databases for read and write models with event-driven synchronization
B. Use a single database for both reads and writes with strong locking
C. Cache all writes on the client and batch update the database later
D. Directly query the write database for all read requests

Solution

  1. Step 1: Understand scaling needs in CQRS

    Separating read and write databases allows independent scaling and optimization for each workload.
  2. Step 2: Evaluate options for scaling reads

    Event-driven synchronization keeps the read database updated asynchronously, enabling fast, scalable queries without locking.
  3. Step 3: Reject unsuitable options

    Single database with locking limits scalability; client caching risks data loss; querying write DB for reads causes contention.
  4. Final Answer:

    Use separate databases for read and write models with event-driven synchronization -> Option A
  5. Quick Check:

    Separate DBs + events = scalable CQRS [OK]
Hint: Separate read/write DBs with events scale best [OK]
Common Mistakes:
  • Using one DB with strong locking, which limits scalability
  • Relying on client-side caching, which risks data loss and inconsistency
  • Reading directly from the write DB, which causes contention