0
0
Microservicessystem_design~10 mins

CQRS pattern in Microservices - Scalability & System Analysis

Choose your learning style9 modes available
Scalability Analysis - CQRS pattern
Growth Table: CQRS Pattern Scaling
Users/TrafficWrite ModelRead ModelData SyncInfrastructure
100 usersSingle write service, single DBSingle read DB, simple queriesSimple event propagation, low latency1 app server, 1 DB server
10K usersWrite service scales horizontally, DB with connection poolingRead replicas added, caching introducedEvent queue for async updatesMultiple app servers, read replicas
1M usersWrite DB sharded by user or domain, write service scaledRead DBs sharded, heavy caching (Redis/ElasticSearch)Robust event streaming (Kafka), eventual consistencyClustered microservices, message brokers
100M usersMulti-region write DB shards, global write servicesGlobal read replicas, CDN for read-heavy dataHighly available event streaming, conflict resolutionGeo-distributed clusters, advanced monitoring
First Bottleneck

At small scale, the write database is the first bottleneck because all commands must be processed and stored reliably. As users grow, the write DB hits limits on connections and transaction throughput.

Read side scales easier with replicas and caching, so write DB capacity and consistency become the main challenge.

Scaling Solutions
  • Horizontal scaling: Add more write service instances behind a load balancer.
  • Database sharding: Split write DB by user or domain to reduce contention.
  • Read replicas: Use multiple read-only DB replicas to handle query load.
  • Caching: Use in-memory caches (Redis) or search engines (ElasticSearch) for fast reads.
  • Event streaming: Use message brokers (Kafka) for reliable async data sync between write and read models.
  • Conflict resolution: Implement eventual consistency and handle conflicts gracefully.
  • Geo-distribution: Deploy services and DB shards in multiple regions for latency and availability.
Back-of-Envelope Cost Analysis

Assuming 1M users with 10,000 requests per second (RPS) total:

  • Write QPS: ~1000 (assuming 10% writes)
  • Read QPS: ~9000 (read-heavy)
  • Write DB: Needs to handle ~1000 transactions/sec, requires sharding or strong scaling
  • Read DB replicas: Each can handle ~5000 QPS, so 2 replicas suffice
  • Cache: Redis can handle 100K ops/sec, enough for read caching
  • Network bandwidth: 1 Gbps (~125 MB/s) sufficient for event streaming and API traffic
  • Storage: Write DB stores all commands, estimate 1 KB per write -> ~1 MB/s write storage growth
Interview Tip

Start by explaining the separation of write and read models in CQRS and why it helps scaling.

Discuss bottlenecks on the write side first, then how read replicas and caching improve read scalability.

Mention event streaming for syncing data and eventual consistency trade-offs.

Finally, talk about sharding and geo-distribution for very large scale.

Self Check

Your database handles 1000 QPS. Traffic grows 10x. What do you do first?

Answer: Introduce database sharding or add write DB replicas with partitioning to distribute load, because a single DB cannot handle 10,000 QPS reliably.

Key Result
CQRS scales well by separating write and read workloads; the write database is the first bottleneck and requires sharding or horizontal scaling, while the read side benefits from replicas and caching.