Saga pattern for distributed transactions in Microservices - System Design Guide

Problem Statement
When a business process spans multiple microservices, a failure in one service can leave the system in an inconsistent state because traditional database transactions cannot span services. Without a way to roll back changes across services, the result is partial updates and inconsistent data.
Solution
The Saga pattern breaks a distributed transaction into a series of smaller local transactions in each service. Each local transaction publishes an event or message to trigger the next step. If a step fails, compensating transactions are executed to undo previous changes, ensuring eventual consistency without locking resources across services.
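As a rough illustration of the event-driven flow described above, the sketch below wires three handlers to an in-memory event bus. The `EventBus` class, the event names, and the handler functions are hypothetical stand-ins chosen for this example; a real system would publish through a message broker.

```python
from collections import defaultdict

class EventBus:
    """Hypothetical in-memory stand-in for a message broker."""
    def __init__(self):
        self.handlers = defaultdict(list)
        self.log = []  # every published event name, in order

    def subscribe(self, event, handler):
        self.handlers[event].append(handler)

    def publish(self, event, payload):
        self.log.append(event)
        for handler in self.handlers[event]:
            handler(payload)

def build_order_saga(bus, stock, payment_ok=True):
    def reserve_inventory(order):
        stock[order["item"]] -= 1                # local transaction in the inventory service
        bus.publish("InventoryReserved", order)  # triggers the next step

    def charge_payment(order):
        if payment_ok:
            bus.publish("PaymentCharged", order)
        else:
            bus.publish("PaymentFailed", order)  # failure event triggers compensation

    def release_inventory(order):
        stock[order["item"]] += 1                # compensating transaction
        bus.publish("InventoryReleased", order)

    bus.subscribe("OrderCreated", reserve_inventory)
    bus.subscribe("InventoryReserved", charge_payment)
    bus.subscribe("PaymentFailed", release_inventory)

stock = {"book": 3}
bus = EventBus()
build_order_saga(bus, stock, payment_ok=False)  # simulate a declined payment
bus.publish("OrderCreated", {"item": "book"})
print(bus.log)
print(stock)
```

With the payment declined, the event log ends with `InventoryReleased` and the stock count returns to its starting value, which is the eventual consistency the pattern aims for.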
Architecture
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│ Service A   │─────▶│ Service B   │─────▶│ Service C   │
│ (Local Tx)  │      │ (Local Tx)  │      │ (Local Tx)  │
└──────┬──────┘      └──────┬──────┘      └──────┬──────┘
       │                   │                   │
       │                   │                   │
       ▼                   ▼                   ▼
Compensate A◀────────Compensate B◀────────Compensate C
(Undo Tx)              (Undo Tx)            (Undo Tx)

This diagram shows a sequence of local transactions across three services with forward events triggering the next step. If any step fails, compensating transactions are triggered in reverse order to undo changes.
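The reverse-order compensation in the diagram can be sketched with a small coordinator. The `Saga` class and its `add_step` and `run` methods are hypothetical names used only for this example, and the steps are placeholders rather than real service calls.

```python
class Saga:
    """Runs steps in order; on failure, undoes completed steps in reverse."""
    def __init__(self):
        self.steps = []  # list of (name, action, compensation)

    def add_step(self, name, action, compensation):
        self.steps.append((name, action, compensation))
        return self

    def run(self):
        completed = []  # (name, compensation) for steps that succeeded
        trace = []      # human-readable record of what happened
        for name, action, compensation in self.steps:
            try:
                action()
            except Exception:
                trace.append(f"{name} failed")
                for done_name, undo in reversed(completed):
                    undo()
                    trace.append(f"compensate {done_name}")
                return False, trace
            completed.append((name, compensation))
            trace.append(f"{name} ok")
        return True, trace

def fail():
    raise RuntimeError("service C unavailable")

saga = (Saga()
        .add_step("A", lambda: None, lambda: None)
        .add_step("B", lambda: None, lambda: None)
        .add_step("C", fail, lambda: None))
ok, trace = saga.run()
print(ok, trace)
```

In this sketch step C is not compensated, because its local transaction never committed; only B and then A are undone.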

Trade-offs
✓ Pros
Enables distributed transactions without locking resources across services.
Improves system availability by avoiding global locks and long transactions.
Supports eventual consistency with clear rollback mechanisms via compensations.
Fits well with event-driven microservices architectures.
✗ Cons
Increases complexity due to managing compensating transactions and failure scenarios.
Requires careful design to ensure compensations correctly undo previous steps.
Eventual consistency means that intermediate, temporarily inconsistent states may be visible to users.
Use when business processes span multiple microservices and strong consistency with distributed locking is not feasible. Suitable for systems with medium to high transaction volumes where eventual consistency is acceptable.
Avoid when strict ACID transactions are required across services or when compensating transactions are impossible or too complex to implement.
Real World Examples
Amazon
Amazon uses the Saga pattern to manage order processing across inventory, payment, and shipping microservices, ensuring orders are either fully processed or properly compensated.
Uber
Uber applies Saga to coordinate ride booking steps across services like driver assignment, payment, and notifications, handling failures gracefully without locking.
Netflix
Netflix uses Saga to maintain consistency in user subscription and billing microservices, allowing independent service updates with compensations on failure.
Code Example
The before code rolls back manually at each failure point, which is error-prone and scatters the rollback logic. The after code uses a Saga class to coordinate the steps and their compensations in one place, so that a failure at any step triggers a consistent rollback.
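### Before: No Saga, naive distributed transaction
# InventoryService, PaymentService, and ShippingService stand for clients of
# the other services. Here they are assumed to return False on failure.
class OrderService:
    def create_order(self, order):
        inventory_result = InventoryService.reserve(order.items)
        if not inventory_result:
            return False
        payment_result = PaymentService.charge(order.payment_info)
        if not payment_result:
            InventoryService.release(order.items)  # manual rollback
            return False
        # A failure here leaves the payment charged and the stock reserved.
        ShippingService.schedule(order)
        return True

### After: Saga pattern with compensating transactions
# The service clients are assumed to raise an exception on failure.
class OrderSaga:
    def execute(self, order):
        # Each step pairs a local transaction with the compensation that undoes it.
        steps = [
            (lambda: InventoryService.reserve(order.items),
             lambda: InventoryService.release(order.items)),
            (lambda: PaymentService.charge(order.payment_info),
             lambda: PaymentService.refund(order.payment_info)),
            (lambda: ShippingService.schedule(order),
             lambda: ShippingService.cancel(order)),
        ]
        completed = []  # compensations for the steps that have succeeded
        try:
            for action, compensation in steps:
                action()
                completed.append(compensation)
        except Exception:
            self.compensate(completed)
            raise

    def compensate(self, completed):
        # Undo only the steps that completed, most recent first.
        for compensation in reversed(completed):
            compensation()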
Alternatives
Two-Phase Commit (2PC)
In 2PC, a coordinator asks every participating service to prepare, which locks the affected resources, and then commits or rolls back all of them together. The resources stay locked for the duration of the transaction.
Use when: Choose 2PC when strict atomicity and consistency are mandatory and the system can tolerate blocking and lower availability.
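For contrast with Saga, here is a minimal in-process sketch of the two phases. The `Participant` class and the `two_phase_commit` function are hypothetical and leave out timeouts, coordinator failure, and durable logging, all of which a real implementation has to handle.

```python
class Participant:
    """Hypothetical resource manager taking part in two-phase commit."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: lock resources and vote. A "prepared" participant
        # keeps its lock until the coordinator decides in phase 2.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    if all(votes):                               # phase 2: unanimous yes, so commit
        for p in participants:
            p.commit()
        return True
    for p in participants:                       # phase 2: any no, so roll back all
        p.rollback()
    return False

inventory = Participant("inventory")
payment = Participant("payment", can_commit=False)  # simulate a "no" vote
result = two_phase_commit([inventory, payment])
print(result, inventory.state, payment.state)
```

The contrast with Saga is that nothing is committed until every participant has voted, so no compensation is needed, at the cost of holding locks while the votes are collected.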
Eventual Consistency with Event Sourcing
Event sourcing stores all changes as events and rebuilds state from them, focusing on immutable logs rather than compensations.
Use when: Choose event sourcing when auditability and replayability of state changes are priorities and eventual consistency is acceptable.
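The idea of rebuilding state from an immutable log can be sketched in a few lines. The `Event` type, the event kinds, and the account-balance domain are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str    # "Deposited" or "Withdrawn" (hypothetical event kinds)
    amount: int

def apply_event(balance, event):
    """Return the new balance after applying one event."""
    if event.kind == "Deposited":
        return balance + event.amount
    if event.kind == "Withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")

def replay(events):
    """Rebuild the current state by folding over the event log."""
    balance = 0
    for event in events:
        balance = apply_event(balance, event)
    return balance

log = [Event("Deposited", 100), Event("Withdrawn", 30)]
log.append(Event("Deposited", 30))  # a correction is appended as a new event, not an edit
current = replay(log)
as_of_second_event = replay(log[:2])
print(current, as_of_second_event)
```

Because the log is never edited, replaying a prefix of it reconstructs any earlier state, which is where the auditability and replayability come from.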
Summary
Saga pattern manages distributed transactions by splitting them into local transactions with compensations.
It avoids locking resources across services and supports eventual consistency.
Compensating transactions undo previous steps if a failure occurs, so the system converges to a consistent state.

Practice

1. What is the main purpose of the Saga pattern in microservices?
easy
A. To replicate data across multiple databases synchronously
B. To manage distributed transactions by breaking them into smaller steps with compensations
C. To speed up database queries by caching results
D. To lock all resources until the transaction completes

Solution

  1. Step 1: Understand distributed transactions challenges

    Distributed transactions across microservices are hard because a single database transaction cannot span services, and locking resources across them is slow and reduces availability.
  2. Step 2: Identify Saga pattern role

    The Saga pattern breaks a big transaction into smaller steps, each with a compensating action to undo if needed, avoiding locks.
  3. Final Answer:

    To manage distributed transactions by breaking them into smaller steps with compensations -> Option B
  4. Quick Check:

    Saga pattern = distributed transaction management [OK]
Hint: Saga means small steps with undo actions for transactions [OK]
Common Mistakes:
  • Thinking Saga locks resources like traditional transactions
  • Confusing Saga with caching or replication
  • Assuming Saga runs all steps in parallel
2. Which of the following is the correct sequence in a Saga pattern transaction?
easy
A. Execute steps and compensations simultaneously
B. Run compensations first, then execute all steps
C. Execute only compensations without any steps
D. Execute steps sequentially, then run compensations if any step fails

Solution

  1. Step 1: Understand Saga execution flow

    Saga executes each step in order. If a step fails, compensations undo previous steps.
  2. Step 2: Confirm correct sequence

    Compensations run only after a failure, never before or simultaneously with steps.
  3. Final Answer:

    Execute steps sequentially, then run compensations if any step fails -> Option D
  4. Quick Check:

    Steps then compensations = correct Saga flow [OK]
Hint: Steps run first; compensations only if failure occurs [OK]
Common Mistakes:
  • Running compensations before any step
  • Running steps and compensations at the same time
  • Skipping compensations on failure
3. Consider a Saga with three steps: A, B, and C. Step B fails after A succeeds. What happens next?
medium
A. Saga retries step B indefinitely without compensation
B. Step C runs regardless of failure
C. Compensation for step A runs, then Saga aborts
D. No compensation runs; Saga commits partial results

Solution

  1. Step 1: Analyze failure impact in Saga

    When step B fails, Saga must undo previous successful steps to keep data consistent.
  2. Step 2: Identify compensation actions

    Compensation for step A runs to roll back its changes, then Saga aborts without running step C.
  3. Final Answer:

    Compensation for step A runs, then Saga aborts -> Option C
  4. Quick Check:

    Failure triggers compensation rollback [OK]
Hint: Failure in middle triggers compensations backward [OK]
Common Mistakes:
  • Assuming later steps run after failure
  • Thinking Saga retries endlessly without rollback
  • Ignoring compensation steps
4. A developer implemented a Saga but noticed data inconsistencies after failures. What is the most likely cause?
medium
A. Compensation actions are missing or incomplete
B. All steps are executed synchronously
C. Steps are too small and independent
D. Saga pattern locks all resources during execution

Solution

  1. Step 1: Identify cause of inconsistencies

    Data inconsistencies after failure usually mean rollback (compensation) did not happen properly.
  2. Step 2: Check compensation implementation

    If compensation actions are missing or incomplete, previous steps cannot be undone, causing inconsistency.
  3. Final Answer:

    Compensation actions are missing or incomplete -> Option A
  4. Quick Check:

    Missing compensation = inconsistency [OK]
Hint: Always implement full compensations for each step [OK]
Common Mistakes:
  • Assuming synchronous execution causes inconsistency
  • Believing small steps cause inconsistency
  • Thinking Saga locks resources like traditional transactions
5. You design a payment system using Saga pattern with steps: debit account, reserve inventory, and confirm order. If inventory reservation fails, what should happen?
hard
A. Run compensation to credit back the debited amount and abort order confirmation
B. Ignore failure and proceed to confirm order
C. Retry inventory reservation indefinitely without compensation
D. Lock all services until inventory is reserved

Solution

  1. Step 1: Understand Saga compensation in payment flow

    If inventory reservation fails, previous successful steps (debit account) must be undone to avoid inconsistent state.
  2. Step 2: Apply compensation and abort

    Compensation credits back the debited amount, and order confirmation is aborted to maintain consistency.
  3. Final Answer:

    Run compensation to credit back the debited amount and abort order confirmation -> Option A
  4. Quick Check:

    Failure triggers compensation rollback and abort [OK]
Hint: Failure in middle step triggers rollback of prior steps [OK]
Common Mistakes:
  • Proceeding despite failure causing inconsistent state
  • Retrying endlessly without rollback
  • Locking services until the step succeeds, which defeats the benefits of Saga
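The scenario in question 5 can be traced with a small sketch. The function name, the dictionary-based account and inventory, and the `InsufficientStock` exception are all hypothetical and stand in for calls to separate services.

```python
class InsufficientStock(Exception):
    pass

def run_payment_saga(account, inventory, item, price):
    """Debit, reserve, confirm; credit back the debit if reservation fails."""
    account["balance"] -= price              # step 1: debit account
    try:
        if inventory.get(item, 0) < 1:       # step 2: reserve inventory
            raise InsufficientStock(item)
        inventory[item] -= 1
    except InsufficientStock:
        account["balance"] += price          # compensation: credit back the debit
        return "aborted"                     # order confirmation never runs
    return "confirmed"                       # step 3: confirm order

account = {"balance": 50}
status_out_of_stock = run_payment_saga(account, {"widget": 0}, "widget", 20)
balance_after_abort = account["balance"]

status_in_stock = run_payment_saga(account, {"widget": 1}, "widget", 20)
balance_after_success = account["balance"]
print(status_out_of_stock, balance_after_abort, status_in_stock, balance_after_success)
```

When the item is out of stock, the debit is credited back and the balance is unchanged; when stock is available, the debit stands and the order is confirmed.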