Kafka · devops · ~15 mins

Event sourcing pattern in Kafka - Deep Dive

Overview - Event sourcing pattern
What is it?
Event sourcing is a way to store data by saving every change as a sequence of events instead of only the current state. Each event represents a fact that happened in the system, so you can rebuild the current state at any time by replaying all events in order. It is often paired with Kafka, a distributed platform built for handling ordered, durable streams of events.
Why it matters
Without event sourcing, systems only keep the latest data, losing the history of how that data changed. This makes it hard to track bugs, audit actions, or recover lost data. Event sourcing solves this by keeping a full history, making systems more reliable, transparent, and easier to fix when problems happen.
Where it fits
Before learning event sourcing, you should understand basic data storage and messaging systems like Kafka. After mastering event sourcing, you can explore related patterns like CQRS (Command Query Responsibility Segregation) and stream processing for building scalable, reactive systems.
Mental Model
Core Idea
Event sourcing stores every change as an immutable event, letting you rebuild the current state by replaying these events in order.
Think of it like...
Imagine writing a diary where you record every action you take each day instead of just writing your current mood. Later, you can read the diary from the start to understand how you got to your current mood.
┌───────────────┐
│ Event Store   │
│ (Kafka Topic) │
└──────┬────────┘
       │
       ▼
┌───────────────┐      ┌───────────────┐
│ Replay Events │─────▶│ Current State │
│ in order      │      │ (Rebuilt)     │
└───────────────┘      └───────────────┘
Build-Up - 7 Steps
1
Foundation: Understanding events as facts
Concept: Events represent facts or changes that happened in the system, not just data snapshots.
An event is a record of something that happened, like 'UserCreated' or 'OrderPlaced'. Each event is immutable, meaning once saved, it never changes. This differs from traditional databases that overwrite data.
Result
You start thinking of data as a timeline of facts, not just a current snapshot.
Understanding that events are facts helps you see why storing them preserves history perfectly.
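A minimal sketch of an event as an immutable fact, using a frozen dataclass. The event name and fields (`UserCreated`, `user_id`, `name`) are illustrative, not a fixed schema:

```python
from dataclasses import dataclass

# frozen=True makes instances immutable: once an event is recorded,
# it can never be changed -- it is a fact about the past.
@dataclass(frozen=True)
class UserCreated:
    user_id: str
    name: str

event = UserCreated(user_id="u1", name="Alice")
# Attempting `event.name = "Bob"` raises an error -- events never change.
```

This is the key contrast with a traditional database row, which would be overwritten in place.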
2
Foundation: Event store as the single source
Concept: All events are stored in one place, called the event store, which acts as the system's truth.
Instead of updating a database row, you append new events to the event store. This store is append-only and keeps every event forever. Kafka topics are often used as event stores because they handle ordered, durable event streams.
Result
You have a complete, ordered log of all changes that ever happened.
Knowing the event store is the single source of truth changes how you design data flow and recovery.
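An in-memory sketch of an append-only event store, assuming events are plain dicts. In production a Kafka topic plays this role; here the log is just a Python list that is only ever appended to:

```python
class EventStore:
    def __init__(self):
        self._log = []  # append-only: entries are never updated or deleted

    def append(self, event):
        self._log.append(event)
        return len(self._log) - 1  # the event's offset in the log

    def read(self, from_offset=0):
        # Replay: return every event at or after the given offset, in order.
        return list(self._log[from_offset:])

store = EventStore()
store.append({"type": "UserCreated", "name": "Alice"})
store.append({"type": "UserNameChanged", "name": "Alicia"})
```

Note that there is no `update` or `delete` method by design: a correction is itself a new event appended to the log.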
3
Intermediate: Rebuilding state by replaying events
🤔 Before reading on: do you think the current state is stored separately or rebuilt from events? Commit to your answer.
Concept: The current state is not stored directly but rebuilt by applying all events in order.
To get the current state, you start from an empty state and apply each event one by one. For example, applying 'UserCreated' then 'UserNameChanged' events will give you the latest user info. This replay can happen anytime to recover or audit state.
Result
You can reconstruct the exact current state at any time from the event history.
Understanding replaying events explains how event sourcing supports full audit and recovery.
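The replay described above is a fold: start empty, apply each event in order. A minimal sketch, with illustrative event types and state shape:

```python
def apply(state, event):
    # Each event type describes how it changes the state.
    if event["type"] == "UserCreated":
        return {"name": event["name"]}
    if event["type"] == "UserNameChanged":
        return {**state, "name": event["name"]}
    return state  # unknown events leave state untouched

def replay(events):
    state = {}               # start from an empty state
    for event in events:     # apply each event in order
        state = apply(state, event)
    return state

events = [
    {"type": "UserCreated", "name": "Alice"},
    {"type": "UserNameChanged", "name": "Alicia"},
]
# replay(events) -> {"name": "Alicia"}
```

Because `apply` is deterministic, replaying the same events always reconstructs the same state, which is what makes audit and recovery possible.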
4
Intermediate: Using Kafka for event sourcing
🤔 Before reading on: do you think Kafka stores events permanently or deletes them after processing? Commit to your answer.
Concept: Kafka stores events durably and in order, making it a natural fit for event sourcing.
Kafka topics keep events in partitions with offsets, preserving order. Consumers can replay events by reading from any offset. Kafka's durability and scalability make it ideal for event sourcing in distributed systems.
Result
You can build systems that reliably store and replay events at scale.
Knowing Kafka's design helps you leverage its strengths for event sourcing.
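A sketch of Kafka-style key-based partitioning: events with the same key are routed to the same partition, so per-key ordering is preserved. The routing function mirrors Kafka's key-hash partitioning in spirit only, not its actual murmur2 hash:

```python
NUM_PARTITIONS = 3

def partition_for(key):
    # Simplified stand-in for Kafka's key hashing: same key -> same partition.
    return sum(key.encode()) % NUM_PARTITIONS

partitions = {i: [] for i in range(NUM_PARTITIONS)}

def produce(key, event):
    partitions[partition_for(key)].append((key, event))

produce("user-1", "UserCreated")
produce("user-2", "UserCreated")
produce("user-1", "UserNameChanged")
# All events for "user-1" sit in one partition, in production order,
# and a consumer can replay that partition from any offset.
```

This is why choosing the partition key matters: ordering is guaranteed only within a partition, so events that must be replayed in sequence (e.g. all events for one user) need the same key.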
5
Intermediate: Snapshotting to optimize replay
Concept: To avoid replaying all events from the start, systems create snapshots of state at points in time.
Snapshots save the current state after many events. When rebuilding, you start from the latest snapshot and replay only newer events. This speeds up recovery and reduces load.
Result
Rebuilding state becomes faster and more efficient.
Understanding snapshots balances full history with practical performance.
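A sketch of snapshot-accelerated replay: persist a `(state, next_offset)` pair every N events, then rebuild by loading the latest snapshot and replaying only the events after it. The counter-style events are illustrative:

```python
SNAPSHOT_EVERY = 100

def apply(state, event):
    return state + event  # toy state: a running total

log = list(range(250))    # 250 events in the store
snapshots = []            # list of (state, next_offset) pairs

state = 0
for offset, event in enumerate(log):
    state = apply(state, event)
    if (offset + 1) % SNAPSHOT_EVERY == 0:
        snapshots.append((state, offset + 1))  # checkpoint after every 100th event

def rebuild(log, snapshots):
    # Start from the latest snapshot (or empty state if none exists)...
    state, start = snapshots[-1] if snapshots else (0, 0)
    # ...and replay only the events newer than the snapshot.
    for event in log[start:]:
        state = apply(state, event)
    return state
```

Here `rebuild` replays 50 events instead of 250 yet produces the same state as a full replay, which is the whole point of snapshotting.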
6
Advanced: Handling event schema evolution
🤔 Before reading on: do you think event formats can change freely without issues? Commit to your answer.
Concept: Event formats must evolve carefully to keep old and new events compatible.
As systems grow, event schemas change (adding fields, renaming). Using schema registries and versioning ensures consumers can read old and new events without errors. This avoids breaking the replay process.
Result
Your event sourcing system remains stable despite changes over time.
Knowing schema evolution prevents costly downtime and data loss.
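One common technique is "upcasting": convert every event to the latest schema version before applying it, so old and new events can be replayed side by side. The field names and version tag below are illustrative; a schema registry plays this role at scale:

```python
def upcast(event):
    # v1 had a single "name" field; v2 (hypothetical) splits it into
    # first_name/last_name. Upcasting lets consumers handle both.
    if event.get("version", 1) == 1:
        first, _, last = event["name"].partition(" ")
        return {"version": 2, "first_name": first, "last_name": last}
    return event  # already the latest version

old_event = {"version": 1, "name": "Alice Smith"}
new_event = {"version": 2, "first_name": "Bob", "last_name": "Jones"}
```

Because the upcaster runs during replay, the event store itself never has to be rewritten, preserving the immutable history.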
7
Expert: Event sourcing tradeoffs and pitfalls
🤔 Before reading on: do you think event sourcing always simplifies system design? Commit to your answer.
Concept: Event sourcing adds complexity and requires careful design to avoid pitfalls.
While event sourcing offers auditability and recovery, it complicates querying current state and debugging. Developers must handle eventual consistency, complex event ordering, and storage growth. Choosing when to use it depends on system needs.
Result
You gain a balanced view of event sourcing's benefits and costs.
Understanding tradeoffs helps you apply event sourcing wisely, avoiding misuse.
Under the Hood
Event sourcing systems append each event to an immutable log stored in an event store like Kafka. Each event has a unique offset and timestamp, ensuring order. Consumers read events sequentially, applying them to build or update state. Kafka's partitioning and replication ensure durability and scalability. Snapshots store intermediate states to optimize replay. Schema registries manage event format versions to maintain compatibility.
Why designed this way?
Event sourcing was designed to solve problems of lost history and difficult recovery in traditional databases. By storing immutable events, systems gain full audit trails and can recover from failures by replaying events. Kafka was chosen for its high-throughput, ordered, and durable event storage, fitting event sourcing needs better than traditional databases. The design balances immutability, scalability, and fault tolerance.
┌───────────────┐
│ Event Producer│
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Kafka Topic   │
│ (Event Store) │
└──────┬────────┘
       │
       ▼
┌───────────────┐      ┌───────────────┐
│ Event Consumer│─────▶│ State Builder │
│ (Replayer)    │      │ (Applies      │
└───────────────┘      │ events to     │
                       │ build state)  │
                       └───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does event sourcing mean storing only the latest state? Commit yes or no.
Common Belief: Event sourcing just stores the current state like a normal database.
Reality: Event sourcing stores every change as an immutable event, not just the latest state.
Why it matters: Believing this causes confusion about how to recover history or audit changes, leading to wrong implementations.
Quick: Can you freely change event formats without issues? Commit yes or no.
Common Belief: You can change event formats anytime without affecting the system.
Reality: Event formats must be versioned and managed carefully to keep backward compatibility.
Why it matters: Ignoring this breaks event replay and causes system failures.
Quick: Does event sourcing always simplify querying data? Commit yes or no.
Common Belief: Event sourcing makes querying data simpler because all history is stored.
Reality: Querying current state can be complex and often requires additional projections or snapshots.
Why it matters: Assuming simple queries leads to performance problems and complex code.
Quick: Is event sourcing suitable for every application? Commit yes or no.
Common Belief: Event sourcing is the best pattern for all data storage needs.
Reality: Event sourcing adds complexity and is best for systems needing auditability and recovery, not simple CRUD apps.
Why it matters: Misapplying event sourcing wastes resources and complicates simple systems.
Expert Zone
1
Event ordering in distributed systems can be tricky; understanding Kafka partitions and keys is crucial to maintain correct event sequences.
2
Snapshot frequency is a tradeoff: too frequent wastes storage, too rare slows recovery; tuning depends on system load and event volume.
3
Event sourcing requires careful handling of eventual consistency, especially when multiple services consume and react to events asynchronously.
When NOT to use
Avoid event sourcing for simple applications with minimal audit needs or where immediate consistency is critical. Use traditional CRUD databases or caching layers instead. Also, if event volume is low and history is not important, event sourcing adds unnecessary complexity.
Production Patterns
In production, event sourcing is combined with CQRS to separate read and write models, using Kafka for event storage and stream processing frameworks like Kafka Streams or ksqlDB for projections. Snapshots are stored in fast-access databases to speed up state rebuilds. Schema registries manage event versions to ensure smooth upgrades.
Connections
Command Query Responsibility Segregation (CQRS)
Builds-on
Knowing event sourcing helps you understand CQRS, which separates commands (writes) from queries (reads) to optimize system performance.
Immutable Ledger in Blockchain
Same pattern
Both event sourcing and blockchain store immutable sequences of events or transactions, ensuring full history and auditability.
Version Control Systems (e.g., Git)
Similar pattern
Like event sourcing, version control records every change as a commit, allowing you to replay history and understand how the current state evolved.
Common Pitfalls
#1 Replaying all events from the start every time causes slow recovery.
Wrong approach: On system restart, replay all events from offset 0 without snapshots.
Correct approach: Use snapshots to start replay from the latest saved state, then apply only newer events.
Root cause: Not using snapshots ignores performance optimization, making recovery inefficient.
#2 Changing event schema without versioning breaks consumers.
Wrong approach: Modify event JSON structure directly without schema registry or version control.
Correct approach: Use a schema registry and version events to maintain backward compatibility.
Root cause: Lack of schema management causes incompatible event formats and runtime errors.
#3 Treating event sourcing as a simple database replacement without adjusting queries.
Wrong approach: Query current state directly from event store without projections or snapshots.
Correct approach: Build read models or projections optimized for queries, separate from event store.
Root cause: Misunderstanding event sourcing's separation of write and read concerns leads to poor performance.
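The projection idea behind pitfall #3 can be sketched as a read model kept separate from the event log: queries hit the projection, never the raw events. The event shapes here are illustrative:

```python
events = [
    {"type": "OrderPlaced", "order_id": "o1", "total": 40},
    {"type": "OrderPlaced", "order_id": "o2", "total": 25},
    {"type": "OrderCancelled", "order_id": "o1"},
]

orders_by_id = {}  # the projection: a read model optimized for lookups
for e in events:
    if e["type"] == "OrderPlaced":
        orders_by_id[e["order_id"]] = {"total": e["total"], "status": "placed"}
    elif e["type"] == "OrderCancelled":
        orders_by_id[e["order_id"]]["status"] = "cancelled"

# Queries now read orders_by_id directly, e.g. orders_by_id["o2"]["status"],
# instead of scanning the full event log on every request.
```

In a Kafka deployment this projection would typically be maintained by a stream processor (e.g. Kafka Streams or ksqlDB) and stored in a query-friendly database.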
Key Takeaways
Event sourcing stores every change as an immutable event, preserving full history and enabling state reconstruction.
Kafka is a powerful tool for event sourcing because it stores ordered, durable event streams that can be replayed anytime.
Snapshots optimize performance by saving intermediate states, reducing the need to replay all events from the start.
Careful schema management is essential to evolve events without breaking consumers or replay processes.
Event sourcing adds complexity and is best used when auditability, recovery, and history are critical requirements.