| Users / Events | 100 Users | 10K Users | 1M Users | 100M Users |
|---|---|---|---|---|
| Event Volume per Day | ~10K events | ~1M events | ~100M events | ~10B events |
| Event Store Size | ~100 MB | ~10 GB | ~1 TB | ~100 TB+ |
| Write Throughput | ~100 QPS | ~10K QPS | ~1M QPS | ~100M QPS (distributed) |
| Read Throughput | ~100 QPS | ~10K QPS | ~1M QPS | ~100M QPS (distributed) |
| Latency | Low (ms) | Low (ms) | Moderate (ms to tens of ms) | Higher (tens of ms) |
| Infrastructure | Single server or small cluster | Cluster with replication | Sharded clusters, partitioned storage | Global distributed clusters, multi-region |
Event store concept in Microservices - Scalability & System Analysis
The first bottleneck is the write throughput of the event store database. As event volume grows, a single node struggles to sustain the write rate while keeping latency low, because an event store issues many small append writes that can saturate disk I/O and CPU.
- Horizontal scaling: Add more event store nodes and partition events by aggregate or stream ID (sharding) to distribute write load.
- Write batching: Group multiple events into batches to reduce I/O overhead.
- Caching: Use in-memory caches for recent events or snapshots to speed up reads.
- Event snapshots: Periodically store snapshots of aggregate state to reduce replay time.
- Replication: Use read replicas to scale read throughput and improve availability.
- Storage tiering: Archive older events to cheaper, slower storage to keep hot storage performant.
- Use specialized append-only stores: Systems optimized for append-only workloads (e.g., EventStoreDB as a purpose-built event store, or Apache Kafka as a distributed log) improve performance.
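As a minimal sketch of the horizontal-scaling item above, the following shows one way to route events to shards by stream ID. The names `shard_for_stream` and `ShardedEventStore` are hypothetical, and the in-memory lists stand in for real event store nodes; this illustrates the routing idea only, not any particular product's API.

```python
import hashlib
from typing import Dict, List, Tuple


def shard_for_stream(stream_id: str, num_shards: int) -> int:
    """Map a stream ID to a shard index using a stable hash."""
    digest = hashlib.sha256(stream_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedEventStore:
    """Toy store: each shard is an in-memory append-only list (assumption)."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[List[Tuple[str, Dict]]] = [[] for _ in range(num_shards)]

    def append(self, stream_id: str, event: Dict) -> int:
        # All events of one stream land on the same shard, so per-stream
        # ordering is preserved while different streams spread the write load.
        index = shard_for_stream(stream_id, len(self.shards))
        self.shards[index].append((stream_id, event))
        return index

    def read_stream(self, stream_id: str) -> List[Dict]:
        index = shard_for_stream(stream_id, len(self.shards))
        return [event for sid, event in self.shards[index] if sid == stream_id]


store = ShardedEventStore(num_shards=4)
store.append("user-1", {"type": "UserCreated"})
store.append("user-1", {"type": "UserNameUpdated"})
store.append("user-2", {"type": "UserCreated"})
print(store.read_stream("user-1"))
```

Plain modulo hashing is the simplest choice here, but it remaps most streams when the shard count changes, so a production system would likely need a more careful partitioning scheme.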
- At 10K users generating 1M events/day, expect ~12 QPS sustained writes (1M / 86400 seconds).
- At 1M users generating 100M events/day, expect ~1,157 QPS sustained writes.
- Assuming roughly 1 KB per event, 100M events/day adds ~100 GB of storage per day.
- Network bandwidth must support event replication and client reads; a 1 Gbps link carries ~125 MB/s, enough for ~125K events/s at 1 KB each, before protocol overhead.
- CPU and disk I/O must be provisioned to handle peak bursts, not just average QPS.
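The estimates above can be reproduced with a few lines of arithmetic. This sketch assumes 1 KB means 1,000 bytes and ignores protocol overhead; the function names and the `peak_factor` parameter are illustrative, not taken from the original text.

```python
SECONDS_PER_DAY = 86_400


def sustained_qps(events_per_day: float) -> float:
    """Average writes per second if events were spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


def daily_storage_gb(events_per_day: float, event_size_bytes: int = 1_000) -> float:
    """Storage added per day in GB, assuming a fixed event size."""
    return events_per_day * event_size_bytes / 1e9


def max_events_per_second(link_gbps: float, event_size_bytes: int = 1_000) -> float:
    """Upper bound on events/s a link can carry, ignoring protocol overhead."""
    bytes_per_second = link_gbps * 1e9 / 8
    return bytes_per_second / event_size_bytes


def peak_qps(events_per_day: float, peak_factor: float) -> float:
    """Provisioning target: average QPS scaled by an assumed burst multiplier."""
    return sustained_qps(events_per_day) * peak_factor


if __name__ == "__main__":
    print(f"10K users: {sustained_qps(1_000_000):.0f} QPS sustained")
    print(f"1M users: {sustained_qps(100_000_000):.0f} QPS sustained")
    print(f"1M users: {daily_storage_gb(100_000_000):.0f} GB/day")
    print(f"1 Gbps link: {max_events_per_second(1):.0f} events/s")
```

The burst multiplier is workload-specific, so `peak_qps` takes it as an input rather than assuming a value.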
Start by explaining the event store's role as an append-only log of events. Discuss how writes dominate the workload and how latency matters. Then, identify the database write throughput as the first bottleneck. Propose sharding and replication as solutions. Mention caching and snapshots to optimize reads. Finally, consider storage growth and archival strategies. Keep your explanation clear and structured.
Your event store database handles 1000 QPS writes. Traffic grows 10x to 10,000 QPS. What do you do first and why?
Answer: The first step is to shard the event store by partitioning events across multiple nodes. This distributes the write load so no single database node is overwhelmed, allowing the system to handle 10x more writes without latency spikes.
Practice
event store in a microservices architecture?
Solution
Step 1: Understand event store role
An event store records all changes as events, preserving order and immutability.
Step 2: Compare options with event store purpose
Only "To save every change as an immutable event in order" describes saving changes as immutable events in order, which matches event store's main function.
Final Answer:
To save every change as an immutable event in order -> Option A
Quick Check:
Event store = immutable ordered events [OK]
- Confusing event store with caching layer
- Thinking event store manages security or load balancing
- Assuming event store modifies events after saving
event store?
Solution
Step 1: Identify event store data structure
Event stores keep data as an append-only log where events cannot be changed once stored.
Step 2: Match options to event store structure
"An append-only log of immutable events" correctly describes this structure, unlike mutable stores or caches.
Final Answer:
An append-only log of immutable events -> Option B
Quick Check:
Event store = append-only immutable log [OK]
- Thinking event store allows event updates
- Confusing event store with relational databases
- Assuming event store is a cache with expiration
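To make the append-only, immutable structure concrete, here is a small in-memory sketch. The `Event` and `AppendOnlyLog` classes are hypothetical names chosen for illustration; a real event store would persist to disk and handle concurrency, which this sketch does not attempt.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, List, Mapping, Tuple


@dataclass(frozen=True)
class Event:
    """An immutable event; frozen=True blocks attribute reassignment."""
    sequence: int
    event_type: str
    data: Mapping[str, Any]


class AppendOnlyLog:
    """Exposes append and read only: no update or delete operations."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event_type: str, data: Mapping[str, Any]) -> Event:
        # Copy the payload and wrap it read-only so callers cannot mutate it later.
        payload = MappingProxyType(dict(data))
        event = Event(len(self._events) + 1, event_type, payload)
        self._events.append(event)
        return event

    def read_all(self) -> Tuple[Event, ...]:
        # Return a tuple so the caller cannot reorder or remove stored events.
        return tuple(self._events)


log = AppendOnlyLog()
created = log.append("UserCreated", {"userId": 1, "name": "Alice"})
try:
    created.event_type = "Tampered"  # rejected by the frozen dataclass
except AttributeError as error:
    print("update rejected:", type(error).__name__)
```

Sequence numbers here start at 1 and simply follow insertion order, which is enough to show that order is preserved once events are stored.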
1: UserCreated {userId: 1, name: "Alice"}
2: UserNameUpdated {userId: 1, name: "Alicia"}
3: UserDeleted {userId: 1}
What is the current state of the user with userId=1 after replaying these events?
Solution
Step 1: Replay events in order
The first event creates user Alice, the second updates the name to Alicia, and the third deletes the user.
Step 2: Determine final user state
After the deletion event, the user no longer exists, regardless of previous name changes.
Final Answer:
User does not exist -> Option D
Quick Check:
Last event is deletion, so user is gone [OK]
- Ignoring the delete event
- Assuming user name remains after deletion
- Confusing event replay order
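The replay in the question above can be sketched as a fold over the three events. The dictionary layout and the `apply_event` and `replay` helpers are assumptions made for this example; only the event names and fields come from the question.

```python
from typing import Dict, List, Optional

events: List[Dict] = [
    {"type": "UserCreated", "userId": 1, "name": "Alice"},
    {"type": "UserNameUpdated", "userId": 1, "name": "Alicia"},
    {"type": "UserDeleted", "userId": 1},
]


def apply_event(state: Optional[Dict], event: Dict) -> Optional[Dict]:
    """Return the next state; None means the user does not exist."""
    if event["type"] == "UserCreated":
        return {"userId": event["userId"], "name": event["name"]}
    if event["type"] == "UserNameUpdated" and state is not None:
        return {**state, "name": event["name"]}
    if event["type"] == "UserDeleted":
        return None
    return state


def replay(stream: List[Dict]) -> Optional[Dict]:
    """Rebuild current state by applying events strictly in stored order."""
    state: Optional[Dict] = None
    for event in stream:
        state = apply_event(state, event)
    return state


print(replay(events[:2]))  # state before the delete event
print(replay(events))      # state after all three events
```

Replaying only the first two events yields the renamed user, while replaying all three yields `None`, matching the answer that the user no longer exists.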
Solution
Step 1: Understand immutability in event stores
Events must be immutable to ensure reliable replay and audit trails.
Step 2: Analyze impact of updating events
Updating events breaks immutability, leading to inconsistent or incorrect system state.
Final Answer:
It breaks the immutability principle, causing inconsistent system state -> Option C
Quick Check:
Event immutability = consistent state [OK]
- Thinking event updates improve debugging
- Assuming updates improve performance
- Believing updates speed up replay
Solution
Step 1: Identify replay challenges with many events
Replaying millions of events is slow and inefficient for rebuilding state.
Step 2: Evaluate solutions to speed up rebuilding
Snapshots save the state at points in time, allowing replay from the latest snapshot forward and reducing the number of events to process.
Final Answer:
Use snapshots to save intermediate states periodically -> Option A
Quick Check:
Snapshots optimize replay by reducing event count [OK]
- Deleting old events breaks audit and consistency
- Storing only latest event loses history
- Replaying events out of order causes errors
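A minimal sketch of the snapshot idea follows, assuming a toy aggregate whose state is a running total and whose events are integer amounts. The class name `SnapshottingStore` and the `snapshot_every` parameter are hypothetical; the point is only that rebuilding starts from the latest snapshot while the full event history is kept.

```python
from typing import List, Tuple


class SnapshottingStore:
    """Keeps every event and periodically records (version, state)."""

    def __init__(self, snapshot_every: int) -> None:
        self.events: List[int] = []
        self.snapshot: Tuple[int, int] = (0, 0)  # (events covered, state)
        self.snapshot_every = snapshot_every

    def load_state(self) -> Tuple[int, int]:
        """Return (state, events replayed), starting from the latest snapshot."""
        version, state = self.snapshot
        tail = self.events[version:]
        for amount in tail:
            state += amount
        return state, len(tail)

    def append(self, amount: int) -> None:
        self.events.append(amount)
        # Take a snapshot every N events; older events are never deleted.
        if len(self.events) % self.snapshot_every == 0:
            state, _ = self.load_state()
            self.snapshot = (len(self.events), state)


store = SnapshottingStore(snapshot_every=100)
for _ in range(250):
    store.append(1)

state, replayed = store.load_state()
print(state, replayed, len(store.events))
```

With a snapshot every 100 events and 250 events appended, rebuilding replays only the 50 events after the last snapshot, and all 250 events remain available for audit.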
