
Two-phase commit (and why to avoid it) in Microservices - System Design Guide

Problem Statement
When several microservices must each update their own database as part of one user action, a failure in any one of them can leave the data inconsistent. Without coordination, a partial update (for example, a payment recorded but the matching order missing) leaves the system in a state that confuses users and causes downstream errors.
Solution
Two-phase commit coordinates all involved services so that they agree to either commit or abort a transaction. First, a coordinator asks each service whether it can commit (prepare phase); a service that answers yes promises that it will be able to commit when asked. If all services agree, the coordinator tells them to commit (commit phase). If any service cannot commit, the coordinator tells all of them to roll back, which keeps the data consistent.
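The two phases can be sketched from a single service's point of view as a small state machine. This is a minimal in-memory illustration, not a production design: the `Participant` class, the `State` enum, the `run` helper, and the balance rule are all hypothetical names chosen for the example, and a real participant would persist its state so that it survives a restart.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical in-memory participant (assumption: no durable storage)."""

    def __init__(self, balance):
        self.balance = balance
        self.pending = None
        self.state = State.INIT

    def prepare(self, delta):
        # Vote "no" if the change would break a local rule.
        if self.balance + delta < 0:
            self.state = State.ABORTED
            return False
        # Vote "yes": stage the change and promise to commit it if asked.
        self.pending = delta
        self.state = State.PREPARED
        return True

    def commit(self):
        if self.state is not State.PREPARED:
            raise RuntimeError("commit is only valid after a yes vote")
        self.balance += self.pending
        self.pending = None
        self.state = State.COMMITTED

    def rollback(self):
        # Safe to call whether or not this participant voted yes.
        if self.state is State.COMMITTED:
            raise RuntimeError("cannot roll back a committed change")
        self.pending = None
        self.state = State.ABORTED


def run(participants, deltas):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare(d) for p, d in zip(participants, deltas)]
    # Phase 2: commit only if every vote was yes, otherwise roll back all.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Calling `run([Participant(100), Participant(0)], [-30, 30])` commits both changes, while starting the first participant at a balance of 10 makes it vote no, so neither balance changes.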
Architecture
[Diagram: a Coordinator connected to Service A DB and Service B DB]

This diagram shows the coordinator communicating with multiple service databases in two phases, prepare and commit, ensuring all-or-nothing updates.

Trade-offs
✓ Pros
Ensures strong consistency across multiple services by coordinating commits.
Prevents partial updates that cause data corruption.
Conceptually simple when only a few services are involved.
✗ Cons
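Participants hold locks from the moment they vote yes in the prepare phase until the coordinator's decision arrives, which reduces throughput and availability.
The coordinator is a single point of failure: if it crashes after collecting votes, prepared services stay blocked until it recovers.
Does not scale well with many services, because every transaction waits on the slowest participant and each added service increases latency and complexity.
Use when strict consistency is mandatory, only a few services are involved (no more than about three to five), and the system can tolerate blocking during commit.
Avoid when the system requires high availability or low latency, or involves many services; also avoid it if the coordinator cannot be recovered quickly after a failure.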
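The blocking problem is easier to see in a small simulation. The sketch below assumes a hypothetical `InDoubtParticipant` that locks a row when it votes yes and releases it only when the coordinator's decision arrives; the row name `sku-42` and the `finish` method are made up for illustration.

```python
class InDoubtParticipant:
    """Hypothetical participant that locks a row between prepare and the decision."""

    def __init__(self):
        self.locked_rows = set()

    def prepare(self, row):
        if row in self.locked_rows:
            # Another transaction is still waiting for its decision on this row.
            return False
        self.locked_rows.add(row)
        return True

    def finish(self, row):
        # Called on commit or rollback; only the coordinator's decision
        # releases the lock.
        self.locked_rows.discard(row)


inventory = InDoubtParticipant()

# Transaction 1 prepares, then the coordinator crashes before phase 2.
first_vote = inventory.prepare("sku-42")

# Transaction 2 needs the same row and cannot proceed.
blocked = not inventory.prepare("sku-42")

# Once the coordinator recovers and delivers its decision, the row is free.
inventory.finish("sku-42")
unblocked = inventory.prepare("sku-42")

print(first_vote, blocked, unblocked)
```

The participant cannot safely release the lock on its own: having voted yes, it does not know whether the other services committed, so it has to wait for the coordinator.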
Real World Examples
Amazon
Used two-phase commit in early order processing to ensure inventory and payment services updated atomically before moving to eventual consistency models.
Uber
Applied two-phase commit in limited cases for trip booking to ensure driver and rider data consistency before switching to compensation-based patterns.
LinkedIn
Used two-phase commit in legacy systems for profile updates spanning multiple databases before adopting event-driven eventual consistency.
Code Example
The before code shows independent updates that can fail partially. The after code introduces a coordinator that asks each service to prepare, then commits all or rolls back all, ensuring atomicity.
### Before (no coordination, risk of partial failure)
class ServiceA:
    def update(self):
        # update local DB
        pass

class ServiceB:
    def update(self):
        # update local DB
        pass

# Caller calls both updates independently
service_a = ServiceA()
service_b = ServiceB()
service_a.update()
service_b.update()  # if this fails after service_a succeeded, the data is inconsistent


### After (Two-phase commit coordinator)
class Coordinator:
    def __init__(self, services):
        self.services = services

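    def two_phase_commit(self):
        # Phase 1: prepare. Every service must vote yes before anything commits.
        for s in self.services:
            try:
                ready = s.prepare()
            except Exception:
                ready = False  # treat an error or timeout as a "no" vote
            if not ready:
                # rollback() must be safe to call on services that never prepared
                self.rollback_all()
                return False
        # Phase 2: commit. The decision is now final; a production coordinator
        # would log it durably and retry commit() until each service confirms.
        for s in self.services:
            s.commit()
        return True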

    def rollback_all(self):
        for s in self.services:
            s.rollback()

class ServiceA:
    def prepare(self):
        # validate and stage the update; return True to vote "yes"
        return True
    def commit(self):
        # make the staged update permanent
        pass
    def rollback(self):
        # discard the staged update
        pass

class ServiceB:
    def prepare(self):
        return True
    def commit(self):
        pass
    def rollback(self):
        pass

services = [ServiceA(), ServiceB()]
coordinator = Coordinator(services)
coordinator.two_phase_commit()
Alternatives
Saga pattern
Breaks a transaction into a sequence of local transactions with compensating actions instead of a global lock.
Use when: you need higher availability and scalability and can accept eventual consistency.
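A saga can be sketched as a list of local steps, each paired with a compensating action that undoes it. The code below is a minimal in-process illustration only: `run_saga` and the order-related functions are hypothetical names, and a real saga would run each step in a different service and would also have to handle a compensation that fails.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Undo the steps that already succeeded, most recent first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log = []


def reserve_stock():
    log.append("stock reserved")


def release_stock():
    log.append("stock released")


def charge_card():
    raise RuntimeError("card declined")


def refund_card():
    log.append("card refunded")


ok = run_saga([(reserve_stock, release_stock), (charge_card, refund_card)])
print(ok, log)
```

Unlike two-phase commit, the stock reservation is visible to other transactions before the saga finishes, and nothing is locked while the payment step runs. The price is that other readers can briefly observe the intermediate state.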
Eventual consistency with event sourcing
Services update independently and communicate changes via events, accepting temporary inconsistencies.
Use when: the system can tolerate temporary inconsistencies and requires high throughput.
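The temporary inconsistency can be illustrated with an in-memory event queue. This sketch covers only the asynchronous hand-off between services, not a full event-sourced store; `EventBus`, the `order_placed` topic, and the order and stock dictionaries are hypothetical stand-ins for a real message broker and real service databases.

```python
from collections import defaultdict, deque


class EventBus:
    """Hypothetical in-memory stand-in for a message broker."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.queue = deque()

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, payload):
        # Events are queued and delivered later, not handled inline.
        self.queue.append((topic, payload))

    def deliver_all(self):
        while self.queue:
            topic, payload = self.queue.popleft()
            for handler in self.handlers[topic]:
                handler(payload)


bus = EventBus()
orders = {}               # owned by the order service
stock = {"sku-42": 5}     # owned by the inventory service


def place_order(order_id, sku):
    orders[order_id] = sku
    bus.publish("order_placed", {"sku": sku})


def on_order_placed(event):
    stock[event["sku"]] -= 1


bus.subscribe("order_placed", on_order_placed)

place_order("o-1", "sku-42")
stock_before_delivery = stock["sku-42"]  # inventory has not seen the order yet
bus.deliver_all()
stock_after_delivery = stock["sku-42"]   # the two services have converged
```

Between `place_order` and `deliver_all` the order exists but the stock count is stale. That window is the inconsistency this alternative accepts in exchange for not blocking either service.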
Summary
Two-phase commit prevents partial updates by coordinating all services to commit or roll back together.
It ensures strong consistency but can block system progress and reduce availability.
Modern microservices often avoid it in favor of eventual consistency patterns like sagas.