# Event Store Concept in Microservices: Scalability & System Analysis

| Users / Events | 100 Users | 10K Users | 1M Users | 100M Users |
|---|---|---|---|---|
| Event Volume per Day | ~10K events | ~1M events | ~100M events | ~10B events |
| Event Store Size | ~100 MB | ~10 GB | ~1 TB | ~100 TB+ |
| Sustained Write Throughput | ~0.1 QPS | ~12 QPS | ~1.2K QPS | ~116K QPS (distributed) |
| Sustained Read Throughput | ~0.1 QPS | ~12 QPS | ~1.2K QPS | ~116K QPS (distributed) |
| Latency | Low (ms) | Low (ms) | Moderate (ms to tens of ms) | Higher (tens of ms) |
| Infrastructure | Single server or small cluster | Cluster with replication | Sharded clusters, partitioned storage | Global distributed clusters, multi-region |
The first bottleneck is the event store database's write throughput. As event volume grows, a single database node struggles to sustain the high rate of writes while keeping latency low: event stores append many small records, which can saturate disk I/O and CPU on a single node. Mitigations include:
- Horizontal scaling: Add more event store nodes and partition events by aggregate or stream ID (sharding) to distribute write load.
- Write batching: Group multiple events into batches to reduce I/O overhead.
- Caching: Use in-memory caches for recent events or snapshots to speed up reads.
- Event snapshots: Periodically store snapshots of aggregate state to reduce replay time.
- Replication: Use read replicas to scale read throughput and improve availability.
- Storage tiering: Archive older events to cheaper, slower storage to keep hot storage performant.
- Specialized event store databases: engines optimized for append-only workloads (e.g., Apache Kafka, EventStoreDB) handle this write pattern far more efficiently than general-purpose databases.
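To make the write-batching idea above concrete, here is a minimal sketch in Python. The `persist_batch` callable, batch size, and time limit are illustrative assumptions, not a specific event store API:

```python
# Minimal write-batching sketch: buffer events in memory and flush them
# in one bulk append when the batch is full or a time limit passes.
# `persist_batch` is a hypothetical stand-in for the store's bulk-append call.
import time

class BatchingWriter:
    def __init__(self, persist_batch, max_batch=100, max_wait_s=0.05):
        self.persist_batch = persist_batch  # callable taking a list of events
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        self.buffer = []
        self.last_flush = time.monotonic()

    def write(self, event):
        self.buffer.append(event)
        if (len(self.buffer) >= self.max_batch
                or time.monotonic() - self.last_flush >= self.max_wait_s):
            self.flush()

    def flush(self):
        if self.buffer:
            self.persist_batch(self.buffer)  # one I/O call for many events
            self.buffer = []
        self.last_flush = time.monotonic()

# Usage: 250 events become 3 bulk appends instead of 250 single writes.
batches = []
writer = BatchingWriter(batches.append, max_batch=100, max_wait_s=10)
for i in range(250):
    writer.write({"seq": i})
writer.flush()
print([len(b) for b in batches])  # → [100, 100, 50]
```

The trade-off is a bounded extra latency (`max_wait_s`) in exchange for far fewer I/O operations per event.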
- At 10K users generating 1M events/day, expect ~12 QPS sustained writes (1M / 86400 seconds).
- At 1M users generating 100M events/day, expect ~1,157 QPS sustained writes.
- Storage grows by roughly 1 KB per event, so 100M events/day adds ~100 GB of storage per day.
- Network bandwidth must support event replication and client reads; 1 Gbps network can handle ~125 MB/s, enough for ~125K events/s at 1 KB each.
- CPU and disk I/O must be provisioned to handle peak bursts, not just average QPS.
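The back-of-envelope estimates above can be checked with a few lines of arithmetic (assuming ~1 KB per event and using 1 KB = 1000 B for rough math):

```python
# Sanity-check the sustained-rate and storage estimates above.
SECONDS_PER_DAY = 86_400
EVENT_SIZE_BYTES = 1_000  # assumed ~1 KB per event (decimal, for rough math)

def sustained_qps(events_per_day):
    # Average write rate; provision for peak bursts well above this.
    return events_per_day / SECONDS_PER_DAY

print(round(sustained_qps(1_000_000)))    # ~12 QPS at 1M events/day
print(round(sustained_qps(100_000_000)))  # ~1157 QPS at 100M events/day

# Daily storage growth: 100M events at ~1 KB each.
daily_gb = 100_000_000 * EVENT_SIZE_BYTES / 1e9
print(daily_gb)  # ~100 GB/day

# Network ceiling: 1 Gbps ≈ 125 MB/s of replication/read traffic.
events_per_sec_on_1gbps = 125e6 / EVENT_SIZE_BYTES
print(events_per_sec_on_1gbps)  # ~125K events/s at 1 KB each
```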
Start by explaining the event store's role as an append-only log of events. Discuss how writes dominate the workload and how latency matters. Then, identify the database write throughput as the first bottleneck. Propose sharding and replication as solutions. Mention caching and snapshots to optimize reads. Finally, consider storage growth and archival strategies. Keep your explanation clear and structured.
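The snapshot optimization mentioned above can be sketched as follows; the event shape, the fold function, and the snapshot cadence are illustrative assumptions:

```python
# Snapshot sketch: instead of replaying every event to rebuild aggregate
# state, start from the latest snapshot and replay only later events.

def apply(state, event):
    # Fold one event into the running state (here: a simple counter).
    return state + event["delta"]

def rebuild(events, snapshot=None):
    """Rebuild state from a (state, version) snapshot plus newer events."""
    state, version = snapshot if snapshot else (0, -1)
    for i, event in enumerate(events):
        if i > version:  # skip events already baked into the snapshot
            state = apply(state, event)
    return state

events = [{"delta": d} for d in (5, -2, 7, 1)]
full_replay = rebuild(events)            # replays all 4 events
snapshot = (rebuild(events[:3]), 2)      # state as of event index 2
fast_replay = rebuild(events, snapshot)  # replays only event index 3
print(full_replay, fast_replay)  # → 11 11
```

Both paths yield the same state, but the snapshot path replays far fewer events, which matters once streams hold thousands of events.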
Question: Your event store database handles 1,000 QPS of writes. Traffic grows 10x to 10,000 QPS. What do you do first, and why?
Answer: The first step is to shard the event store by partitioning events across multiple nodes by aggregate or stream ID. This distributes the write load so that no single node is overwhelmed, letting the system absorb 10x more writes without latency spikes while preserving per-stream ordering.
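A minimal sketch of that sharding step, assuming hash-based routing by stream ID and an illustrative 10-node cluster (e.g., 10,000 QPS spread at roughly 1,000 QPS per node):

```python
# Hash-based shard routing sketch: events for the same stream always land
# on the same node (preserving per-stream order), while overall write load
# spreads across all nodes. The 10-shard count is illustrative.
import hashlib

NUM_SHARDS = 10

def shard_for(stream_id):
    # Stable hash so the same stream always maps to the same shard.
    digest = hashlib.sha256(stream_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# All events of one stream stay ordered on one shard...
assert shard_for("order-42") == shard_for("order-42")

# ...while many streams spread across the cluster.
shards = {shard_for(f"order-{i}") for i in range(1000)}
print(len(shards))  # number of distinct shards receiving writes
```

A production system would typically use consistent hashing so that adding nodes later moves only a fraction of the streams.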