Design: Distributed Transaction Management with Two-Phase Commit
This design focuses on coordinating distributed transactions with the two-phase commit (2PC) protocol and on exploring its drawbacks in microservices. Alternative transaction models, such as event sourcing or Saga pattern implementations, are out of scope.
Functional Requirements
FR1: Ensure atomicity of transactions across multiple microservices
FR2: Guarantee all-or-nothing commit for distributed operations
FR3: Handle failures during transaction commit or rollback
FR4: Support concurrent transactions without data corruption
Non-Functional Requirements
NFR1: System must handle up to 1000 distributed transactions per second
NFR2: Transaction commit latency should be under 500ms in normal conditions
NFR3: Availability target of 99.9% uptime
NFR4: Microservices are independently deployable and scalable
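NFR1 and NFR2 together imply a rough number of transactions that are open at the same time, which matters because each open transaction may hold locks. The sketch below is only a back-of-the-envelope estimate using Little's law (average in-flight = arrival rate x latency); the function names are illustrative and not part of the design.

```python
# Back-of-the-envelope sizing from the stated NFRs.
# Little's law: average items in the system = arrival rate * time in system.

def in_flight_transactions(tx_per_second: float, latency_seconds: float) -> float:
    """Average number of transactions open at once."""
    return tx_per_second * latency_seconds


def allowed_downtime_hours_per_year(availability: float) -> float:
    """Downtime budget implied by an availability target (365-day year)."""
    return (1.0 - availability) * 365 * 24


# NFR1 (1000 tx/s) at the NFR2 latency ceiling (500 ms).
print(in_flight_transactions(1000, 0.5))  # 500.0

# NFR3 (99.9% uptime) allows a little under 9 hours of downtime per year.
print(round(allowed_downtime_hours_per_year(0.999), 2))  # 8.76
```

At the latency ceiling, roughly 500 transactions may be holding participant locks at any moment, which is the figure to keep in mind when discussing lock contention.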
Think Before You Design
Key Components
Transaction coordinator service
Microservice participants with prepare and commit endpoints
Persistent logs for transaction state
Timeout and retry mechanisms
Design Patterns
Two-phase commit protocol
Distributed locking
Timeout and failure recovery
Alternatives like Saga pattern for eventual consistency
Reference Architecture
                +---------------------+
                |     Transaction     |
                | Coordinator Service |
                +----------+----------+
                           |
          Prepare / Commit | 2PC Protocol
                           |
    +----------------------+----------------------+
    |                      |                      |
+---v---+              +---v---+              +---v---+
|Service|              |Service|              |Service|
|   A   |              |   B   |              |   C   |
+-------+              +-------+              +-------+
Components
Transaction Coordinator Service
Stateless service with persistent transaction log (e.g., PostgreSQL or distributed consensus store)
Manages transaction lifecycle: sends prepare requests, collects votes, decides commit or abort, and instructs participants
Microservice Participants
REST/gRPC endpoints with local database support
Execute the prepare phase by locking resources and validating the transaction, then commit or roll back based on the coordinator's decision
Persistent Transaction Log
Durable storage like relational DB or distributed consensus system
Stores transaction states to recover from failures and ensure durability
Timeout and Retry Mechanism
Built-in coordinator logic with timers
Detects participant failures or network issues and triggers abort or recovery procedures
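The timeout component can be illustrated with a small sketch of vote collection under a deadline. It assumes each participant's prepare call is wrapped in a plain Python callable; `collect_votes` and `slow_participant` are hypothetical names, and a real coordinator would issue REST or gRPC calls instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def collect_votes(prepare_calls, timeout_seconds):
    """Run all prepare calls in parallel; True only if every vote is commit in time.

    prepare_calls maps a participant name to a zero-argument callable that
    returns True (vote commit) or False (vote abort).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(prepare_calls)))
    futures = [pool.submit(call) for call in prepare_calls.values()]
    done, not_done = wait(futures, timeout=timeout_seconds)
    # Do not block on stragglers: a missing vote is treated as a vote to abort.
    pool.shutdown(wait=False)
    if not_done:
        return False
    # A participant that raised an error also counts as a vote to abort.
    return all(f.exception() is None and f.result() is True for f in done)


def slow_participant():
    time.sleep(0.5)  # simulates a hung or overloaded service
    return True


print(collect_votes({"A": lambda: True, "B": slow_participant}, timeout_seconds=0.1))  # False
```

Treating a missing vote as an abort is safe in the prepare phase because no participant has been told to commit yet.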
Request Flow
1. Client sends distributed transaction request to Transaction Coordinator.
2. Coordinator sends 'prepare' request to all participant microservices.
3. Each participant tries to lock resources and validate transaction, then replies 'vote commit' or 'vote abort'.
4. Coordinator collects all votes; if all vote commit, sends 'commit' command; otherwise sends 'abort'.
5. Participants commit or roll back changes accordingly and acknowledge completion.
6. Coordinator marks transaction as complete in persistent log and responds to client.
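The steps above can be sketched as a minimal in-memory simulation. This is an illustration under simplifying assumptions, not a production implementation: the `Participant` and `Coordinator` classes are hypothetical, the log is a Python list rather than durable storage, and there is no networking or timeout handling.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Participant:
    """In-memory stand-in for a microservice with prepare/commit/rollback endpoints."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.locked = False   # resources held between prepare and the decision
        self.committed = []   # transaction ids applied locally

    def prepare(self, tx_id):
        if not self.can_commit:
            return Vote.ABORT
        self.locked = True
        return Vote.COMMIT

    def commit(self, tx_id):
        self.committed.append(tx_id)
        self.locked = False

    def rollback(self, tx_id):
        self.locked = False


class Coordinator:
    def __init__(self, participants):
        self.participants = participants
        self.log = []  # stand-in for the persistent transaction log

    def run(self, tx_id):
        # Phase 1: ask every participant to prepare and collect the votes.
        votes = [p.prepare(tx_id) for p in self.participants]
        all_yes = all(v is Vote.COMMIT for v in votes)
        decision = "COMMITTED" if all_yes else "ABORTED"
        # Record the decision before broadcasting it, so it survives a crash.
        self.log.append((tx_id, decision))
        # Phase 2: broadcast the decision to every participant.
        for p in self.participants:
            if all_yes:
                p.commit(tx_id)
            else:
                p.rollback(tx_id)
        return decision


services = [Participant("A"), Participant("B"), Participant("C", can_commit=False)]
print(Coordinator(services).run("tx-1"))  # ABORTED
```

A single abort vote from service C is enough to roll back A and B, which is the all-or-nothing behaviour required by FR2.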
Database Schema
Entities:
- Transaction: transaction_id (PK), status (PREPARED, COMMITTED, ABORTED), timestamp
- Participant: participant_id (PK), transaction_id (FK), vote (COMMIT, ABORT), status (PREPARED, COMMITTED, ABORTED)
Relationships:
- One Transaction has many Participants
- Participant records votes and status for recovery and coordination
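One possible concrete form of this schema is sketched below using Python's built-in `sqlite3` module. The table names, column types, and the `open_log` helper are assumptions for illustration. The sketch also uses a composite key of (participant_id, transaction_id), which differs from the single-column key listed above, so that the same service can take part in many transactions.

```python
import sqlite3

# Assumed DDL for the two entities; the status and vote values follow the schema above.
SCHEMA = """
CREATE TABLE transactions (
    transaction_id TEXT PRIMARY KEY,
    status TEXT NOT NULL CHECK (status IN ('PREPARED', 'COMMITTED', 'ABORTED')),
    timestamp TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE participants (
    participant_id TEXT NOT NULL,
    transaction_id TEXT NOT NULL REFERENCES transactions(transaction_id),
    vote TEXT CHECK (vote IN ('COMMIT', 'ABORT')),
    status TEXT NOT NULL CHECK (status IN ('PREPARED', 'COMMITTED', 'ABORTED')),
    PRIMARY KEY (participant_id, transaction_id)
);
"""


def open_log(path=":memory:"):
    """Open the transaction log database and create the tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


conn = open_log()
conn.execute("INSERT INTO transactions (transaction_id, status) VALUES ('tx-1', 'PREPARED')")
conn.execute("INSERT INTO participants VALUES ('svc-a', 'tx-1', 'COMMIT', 'PREPARED')")
print(conn.execute("SELECT participant_id, vote FROM participants").fetchall())
```

An in-memory database is used only to keep the example self-contained; durability requires a file-backed or server database.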
Scaling Discussion
Bottlenecks
Transaction Coordinator becomes a single point of failure and bottleneck under high load.
Participants lock resources during prepare phase, reducing concurrency and increasing latency.
Network delays or failures cause blocking and long transaction times.
Coordinator waiting for slow or failed participants delays entire transaction.
Increased complexity and coupling reduce microservices independence.
Solutions
Replicate the coordinator and use leader election so a standby can take over on failure; shard transactions across several coordinator instances to distribute load.
Optimize participant locking strategies and reduce transaction scope to minimize lock duration.
Implement timeouts and failure detection to abort stalled transactions quickly.
Consider alternative patterns like Saga for eventual consistency to avoid blocking.
Partition transactions to reduce cross-service dependencies and improve scalability.
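Whichever coordinator instance takes over after a failure has to resolve the transactions that were in flight. The sketch below shows one common rule under stated assumptions: a logged commit decision is resent, and a transaction with no logged decision is aborted, which is safe because no participant can have been told to commit. The `recovery_actions` function and the `DONE` status (all acknowledgements received) are hypothetical additions, not part of the schema above.

```python
def recovery_actions(log_records):
    """Map each unfinished transaction to the message a restarted coordinator resends.

    log_records is an iterable of (transaction_id, status) tuples in write order,
    where status is PREPARED, COMMITTED, ABORTED, or the assumed marker DONE.
    """
    latest = {}
    for tx_id, status in log_records:
        latest[tx_id] = status  # later records override earlier ones

    actions = {}
    for tx_id, status in latest.items():
        if status == "COMMITTED":
            # The decision was durable, so it must be driven to completion.
            actions[tx_id] = "commit"
        elif status in ("PREPARED", "ABORTED"):
            # No commit decision was logged, so aborting is safe.
            actions[tx_id] = "abort"
        # DONE: every participant acknowledged; nothing to resend.
    return actions


log = [
    ("t1", "PREPARED"), ("t1", "COMMITTED"),
    ("t2", "PREPARED"),
    ("t3", "PREPARED"), ("t3", "ABORTED"), ("t3", "DONE"),
]
print(recovery_actions(log))  # {'t1': 'commit', 't2': 'abort'}
```

Resending a decision is only safe if participants treat commit and abort messages as idempotent.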
Interview Tips
Time: Spend 10 minutes explaining two-phase commit protocol and its steps, 10 minutes discussing drawbacks and failure scenarios, 10 minutes proposing alternatives and scaling strategies, and 15 minutes answering questions and clarifying trade-offs.
Explain how two-phase commit ensures atomicity across distributed services.
Discuss the blocking problem and impact on availability and latency.
Highlight the coordinator as a potential bottleneck and single point of failure.
Mention real-world challenges like network partitions and participant crashes.
Suggest alternatives like Saga pattern for better scalability and resilience.
Practice
1. What is the main purpose of the two-phase commit protocol in microservices?
easy
A. To automatically retry failed requests
B. To speed up communication between services
C. To allow services to work independently without coordination
D. To ensure all services agree on a transaction before committing
Solution
Step 1: Understand the role of two-phase commit
Two-phase commit is designed to make sure all parts of a distributed transaction agree to commit or abort together.
Step 2: Identify the main goal in microservices
Its main goal is to keep data consistent across multiple services by coordinating their commit decisions.
Final Answer:
To ensure all services agree on a transaction before committing -> Option D
Quick Check:
Two-phase commit = agreement before commit [OK]
Hint: Two-phase commit means all must say yes before commit [OK]
Common Mistakes:
Thinking it speeds up communication
Believing services act independently
Assuming it retries failed requests automatically
2. Which of the following correctly describes the two phases in the two-phase commit protocol?
easy
A. Abort phase where coordinator asks, Prepare phase where services finalize
B. Prepare phase where coordinator asks, Commit phase where services finalize
C. Commit phase where coordinator asks, Prepare phase where services finalize
D. Prepare phase where services finalize, Commit phase where coordinator asks
Solution
Step 1: Recall the names and order of the two phases
The first phase is the prepare phase where the coordinator asks all services if they can commit.
Step 2: Understand the commit phase
If all agree, the coordinator sends a commit command to finalize the transaction.
Final Answer:
Prepare phase where coordinator asks, Commit phase where services finalize -> Option B
Common Mistakes:
Thinking services finalize before coordinator asks
3. Consider a microservices system using two-phase commit. If one service fails to respond during the prepare phase, what is the expected outcome?
medium
A. The coordinator ignores the failure and proceeds
B. The coordinator commits the transaction anyway
C. The coordinator aborts the transaction and tells all services to rollback
D. The coordinator retries the prepare phase indefinitely
Solution
Step 1: Analyze failure during prepare phase
If any service fails to respond or votes no during prepare, the coordinator must abort to keep consistency.
Step 2: Understand coordinator's action
The coordinator sends abort commands to all services to rollback any partial changes.
Final Answer:
The coordinator aborts the transaction and tells all services to rollback -> Option C
Quick Check:
Failure in prepare = abort transaction [OK]
Hint: Any no or failure in prepare means abort [OK]
Common Mistakes:
Assuming commit happens despite failure
Thinking coordinator retries forever
Ignoring failure and proceeding anyway
4. A developer notices that their two-phase commit implementation causes long delays and system hangs when a service crashes. What is the most likely cause?
medium
A. The coordinator is waiting indefinitely for responses from crashed services
B. The services are committing too quickly without coordination
C. The coordinator is skipping the prepare phase
D. The services are not logging their transactions
Solution
Step 1: Identify cause of delays and hangs
In two-phase commit, the coordinator waits for all services to respond during prepare phase.
Step 2: Understand impact of crashed services
If a service crashes and the coordinator has no timeout on the prepare phase, it waits indefinitely for the missing vote, causing delays and system hangs.
Final Answer:
The coordinator is waiting indefinitely for responses from crashed services -> Option A
Quick Check:
Waiting on crashed service = system hang [OK]
Hint: Without a timeout, the coordinator waits forever on a crashed service [OK]
Common Mistakes:
Thinking services commit too fast causes hangs
Believing skipping prepare phase causes delays
Assuming missing logs cause system hangs
5. Why is two-phase commit often avoided in modern microservices architectures despite ensuring consistency?
hard
A. Because it causes blocking, reduces availability, and hurts scalability
B. Because it does not guarantee data consistency
C. Because it requires no coordination between services
D. Because it is too simple and lacks fault tolerance
Solution
Step 1: Understand drawbacks of two-phase commit
Participants hold their locks from the prepare phase until the coordinator's decision arrives, so a slow or failed node blocks the others and reduces system availability and scalability.
Step 2: Recognize why modern systems avoid it
Modern microservices prefer eventual consistency and non-blocking patterns to improve performance and fault tolerance.
Final Answer:
Because it causes blocking, reduces availability, and hurts scalability -> Option A
Quick Check:
Blocking and low availability = avoid two-phase commit [OK]
Hint: Two-phase commit blocks and limits scalability [OK]