
Saga pattern for distributed transactions in Kafka - Step-by-Step Execution

Process Flow - Saga pattern for distributed transactions
Order Service publishes event
Kafka: saga-events topic
Payment Service consumes
Payment fails: publish compensation event
Kafka: saga-compensation
Order Service rolls back
Transaction complete
The order service starts a saga by publishing an event to Kafka. Each downstream service consumes, processes, and publishes result events. On failure, compensation events trigger rollback of previously completed steps.
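The flow above can be sketched without a broker. This is a minimal, hypothetical simulation in which plain Python lists stand in for the two Kafka topics; the function and field names mirror the events used later in the execution sample.

```python
# Minimal saga-flow sketch: lists stand in for the saga-events and
# saga-compensation topics; no real Kafka broker is involved.
saga_events = []        # forward-flow topic
saga_compensation = []  # rollback-flow topic

def order_service_start(order_id):
    # Order Service commits locally, then publishes the saga-start event.
    saga_events.append({"order_id": order_id, "step": "order-created", "status": "success"})

def payment_service_consume(event, payment_ok):
    # Payment Service consumes the order event and attempts payment.
    if payment_ok:
        saga_events.append({"order_id": event["order_id"], "step": "payment-completed", "status": "success"})
    else:
        # On failure, publish a compensation event instead of a global rollback.
        saga_compensation.append({"order_id": event["order_id"], "step": "payment-failed", "action": "rollback-order"})

order_service_start("ORD-001")
payment_service_consume(saga_events[0], payment_ok=False)
print(saga_compensation[0]["action"])  # rollback-order
```

In a real deployment each append would be a produce call to the corresponding Kafka topic, but the control flow is the same.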
Execution Sample
Kafka
# Step 1: Create saga topics
kafka-topics.sh --create --topic saga-events --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
kafka-topics.sh --create --topic saga-compensation --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1

# Step 2: Publish order-created event
echo '{"order_id":"ORD-001","step":"order-created","status":"success"}' | kafka-console-producer.sh --bootstrap-server localhost:9092 --topic saga-events

# Step 3: Consume and verify
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic saga-events --from-beginning --max-messages 1

# Step 4: Simulate payment failure - publish compensation
echo '{"order_id":"ORD-001","step":"payment-failed","action":"rollback-order"}' | kafka-console-producer.sh --bootstrap-server localhost:9092 --topic saga-compensation

# Step 5: Verify compensation event
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic saga-compensation --from-beginning --max-messages 1
This demonstrates a simplified saga flow: creating coordination topics, publishing an order event, simulating a payment failure, and publishing a compensation event for rollback.
Process Table
| Step | Action | saga-events Topic | saga-compensation Topic | Transaction State | Next Step |
|------|--------|-------------------|-------------------------|-------------------|-----------|
| 1 | Create saga-events and saga-compensation topics | Created (empty) | Created (empty) | Not started | Publish order event |
| 2 | Order Service publishes order-created event | 1 message (ORD-001, order-created) | Empty | Started | Payment Service consumes |
| 3 | Payment Service consumes order event | 1 message (consumed) | Empty | Processing payment | Process payment |
| 4 | Payment fails; publish compensation event | 1 message | 1 message (ORD-001, rollback-order) | Compensating | Order Service consumes compensation |
| 5 | Order Service consumes compensation event | 1 message | 1 message (consumed) | Rolled back | Transaction ends |
| 6 | Transaction complete; order cancelled | 1 message | 1 message | Failed (compensated) | End |
💡 Saga completed with compensation. Order rolled back after payment failure. Both topics retain event history.
Status Tracker
| Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| orderStatus | None | Created | Created | Compensating | Cancelled |
| paymentStatus | None | None | Processing | Failed | Failed |
| sagaEventsCount | 0 | 1 | 1 | 1 | 1 |
| compensationCount | 0 | 0 | 0 | 1 | 1 |
| transactionState | Not started | Started | Processing | Compensating | Rolled back |
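The tracker rows can be replayed as a small state machine. This is a sketch under the assumption that each saga step mutates one shared state record; the variable names come from the tracker, while the step functions are hypothetical.

```python
# Replay the saga states from the status tracker: each step mutates the
# state dict, and the end result should match the "Final" column.
state = {"orderStatus": None, "paymentStatus": None,
         "sagaEventsCount": 0, "compensationCount": 0,
         "transactionState": "Not started"}

def step2_publish_order(s):
    s.update(orderStatus="Created", sagaEventsCount=1, transactionState="Started")

def step3_consume_order(s):
    s.update(paymentStatus="Processing", transactionState="Processing")

def step4_payment_fails(s):
    s.update(orderStatus="Compensating", paymentStatus="Failed",
             compensationCount=1, transactionState="Compensating")

def step5_rollback_order(s):
    s.update(orderStatus="Cancelled", transactionState="Rolled back")

for step in (step2_publish_order, step3_consume_order,
             step4_payment_fails, step5_rollback_order):
    step(state)

print(state["transactionState"])  # Rolled back
```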
Key Moments - 3 Insights
Why does the saga use two separate Kafka topics instead of one?
Separating saga-events (forward flow) from saga-compensation (rollback flow) prevents consumers from mixing up progress events with rollback commands. Each topic has its own consumer group, so services only listen to the events relevant to their role.
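The topic separation can be illustrated with a toy publish/subscribe sketch. This is not the Kafka consumer-group protocol, just an assumed simplification where each service registers a handler only for the topic that matches its role.

```python
# Sketch: each service subscribes only to the topic relevant to its role.
# Subscriptions are modelled as a topic -> list-of-handlers map.
subscriptions = {
    "saga-events": [],        # forward-flow consumers (e.g. Payment Service)
    "saga-compensation": [],  # rollback consumers (e.g. Order Service)
}

def subscribe(topic, handler):
    subscriptions[topic].append(handler)

def publish(topic, event):
    # Every subscriber on the topic receives the event; subscribers on the
    # other topic never see it, so forward and rollback flows stay separate.
    for handler in subscriptions[topic]:
        handler(event)

received = []
subscribe("saga-events", lambda e: received.append(("payment-service", e["step"])))
subscribe("saga-compensation", lambda e: received.append(("order-service", e["step"])))

publish("saga-events", {"step": "order-created"})
publish("saga-compensation", {"step": "payment-failed"})
print(received)
# [('payment-service', 'order-created'), ('order-service', 'payment-failed')]
```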
What happens if the compensation event itself fails to publish?
Kafka's producer retries and acknowledgment (acks=all) ensure the compensation event is durably written. If the producer crashes, the saga orchestrator can detect the incomplete saga via the transaction store and re-publish the compensation event — this is why idempotent consumers are critical.
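Why idempotency matters can be shown in a few lines. In this sketch (an assumption, not Kafka API code), the consumer deduplicates on an (order_id, step) key so that a redelivered compensation event is applied exactly once.

```python
# Idempotency sketch: the consumer remembers processed (order_id, step)
# keys so a re-delivered compensation event is applied only once.
processed = set()
rollbacks = 0

def handle_compensation(event):
    global rollbacks
    key = (event["order_id"], event["step"])
    if key in processed:
        return  # duplicate delivery from a retry: safe to ignore
    processed.add(key)
    rollbacks += 1  # perform the actual rollback exactly once

event = {"order_id": "ORD-001", "step": "payment-failed"}
handle_compensation(event)
handle_compensation(event)  # simulated producer/broker retry
print(rollbacks)  # 1
```

In production the processed-key set would live in the service's own database, updated in the same local transaction as the rollback itself.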
How is this different from a two-phase commit?
Two-phase commit locks resources across all services until all agree to commit — creating tight coupling and blocking. The saga pattern uses eventual consistency: each service commits locally and compensates on failure. Kafka decouples the services so they never block each other.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table. At which step does the transaction state change to 'Compensating'?
A. Step 2: when the order event is published
B. Step 3: when the Payment Service consumes
C. Step 4: when the compensation event is published
D. Step 6: when the transaction completes
💡 Hint
Check the Transaction State column — it changes to Compensating when the rollback event enters the compensation topic.
From the variable tracker, what is the compensationCount after the payment failure at Step 4?
A. 0
B. 1
C. 2
D. 3
💡 Hint
Check the compensationCount row — it increments from 0 to 1 when the compensation event is published.
If Payment Service succeeded but Inventory Service failed, what would change in the flow?
A. Two compensation events: one for payment rollback and one for order rollback
B. Only the order gets rolled back
C. No compensation needed since payment succeeded
D. The entire Kafka cluster restarts
💡 Hint
Each completed saga step needs its own compensation. If inventory fails after payment succeeded, both payment and order must be compensated in reverse order.
Concept Snapshot
The saga pattern coordinates distributed transactions without distributed locks.
Each microservice commits locally and publishes events to Kafka.
On failure, compensation events trigger rollback of completed steps.
Use separate topics for forward events and compensation events.
Idempotent consumers handle retries safely.
Kafka guarantees durable delivery of saga coordination messages.
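The rollback rule in the snapshot, compensating completed steps in reverse order, can be sketched as follows. The step and action names here are hypothetical, extending the two-step demo with an inventory step that fails.

```python
# Sketch: completed steps are compensated in reverse order. If inventory
# fails after payment succeeded, both payment and order are rolled back.
completed_steps = ["order-created", "payment-completed"]  # inventory step failed next

def compensation_events(steps):
    # Walk the completed steps backwards, emitting one rollback per step.
    actions = {"payment-completed": "rollback-payment",
               "order-created": "rollback-order"}
    return [actions[s] for s in reversed(steps)]

print(compensation_events(completed_steps))
# ['rollback-payment', 'rollback-order']
```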
Full Transcript
This visual execution shows how the saga pattern works with Kafka for distributed transactions. The order service starts the saga by publishing an order-created event to the saga-events topic. The payment service consumes this event and attempts to process payment. When payment fails, instead of rolling back everything in a single transaction, the payment service publishes a compensation event to the saga-compensation topic. The order service consumes this compensation event and rolls back the order. Each step is tracked through the variable tracker showing how order status, payment status, and transaction state evolve. The key insight is that saga uses eventual consistency through events rather than distributed locks, making it scalable and resilient.