Distributed transactions and 2PC in DBMS Theory - Full Explanation

Introduction
Imagine you need to update information stored in different places at the same time, but you want to make sure all updates happen together or none at all. This problem arises in systems where data is spread across multiple databases or servers. Distributed transactions and the Two-Phase Commit protocol help solve this challenge by coordinating these updates safely.
Explanation
Distributed Transactions
Distributed transactions involve operations that span multiple separate databases or systems. They ensure that all parts of the transaction either complete successfully or all fail, maintaining data consistency across all locations. This is important because partial updates can cause errors or data corruption.
Distributed transactions guarantee that multiple systems update data together as a single unit.
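To see why partial updates are a problem, here is a minimal sketch of a transfer that updates one store and then fails before updating the other. The two dictionaries stand in for separate databases, and `naive_transfer` and the account names are hypothetical, chosen only for illustration.

```python
# Two in-memory dicts stand in for two separate databases (an assumption
# made purely for illustration).
bank_a = {"alice": 100}
bank_b = {"bob": 50}


def naive_transfer(amount, fail_midway=False):
    """Move money from bank_a to bank_b with no coordination."""
    bank_a["alice"] -= amount          # first update succeeds
    if fail_midway:
        # Simulate a crash or network failure between the two updates.
        raise ConnectionError("bank_b unreachable")
    bank_b["bob"] += amount            # never runs when the failure occurs


try:
    naive_transfer(30, fail_midway=True)
except ConnectionError:
    pass

# The money left bank_a but never arrived in bank_b: a partial update.
total = bank_a["alice"] + bank_b["bob"]
print(bank_a, bank_b, "total:", total)  # total is 120, not the original 150
```

A distributed transaction is meant to rule out exactly this outcome: either both balances change or neither does.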
Two-Phase Commit (2PC) Protocol
The Two-Phase Commit protocol is a method to coordinate distributed transactions. It works in two steps: first, a prepare phase where every system votes on whether it can commit; second, a commit phase where the changes are made permanent. If any system cannot commit, all changes are rolled back to keep data consistent.
2PC ensures all systems agree before making permanent changes, preventing partial updates.
Prepare Phase
In the prepare phase, the coordinator asks all participating systems if they are ready to commit the transaction. Each system checks if it can complete the operation and replies with a yes or no. A yes vote is a promise: the system must be able to commit later if asked. This phase is crucial to detect any problems before making changes permanent.
Prepare phase checks readiness of all systems to commit before proceeding.
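The vote collection described above might be sketched as follows. This is an illustration under stated assumptions, not a real driver: `collect_votes`, the participant names, and the callables are hypothetical stand-ins for network calls, and a missing reply is treated as a "no" vote, which is a common convention.

```python
def collect_votes(participants):
    """Ask every participant to prepare; return a dict of name -> vote.

    `participants` maps a name to a zero-argument callable that returns
    True (yes) or False (no). Both are hypothetical stand-ins for real
    network calls.
    """
    votes = {}
    for name, prepare in participants.items():
        try:
            votes[name] = bool(prepare())
        except Exception:
            # No reply (crash, timeout, network error) counts as a "no".
            votes[name] = False
    return votes


def unreachable():
    raise TimeoutError("no reply from participant")


votes = collect_votes({
    "DB1": lambda: True,
    "DB2": lambda: False,   # e.g. a constraint check failed
    "DB3": unreachable,     # never answers
})
print(votes)  # {'DB1': True, 'DB2': False, 'DB3': False}
```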
Commit Phase
If all systems respond positively in the prepare phase, the coordinator sends a commit command to all participants. Each system then makes the changes permanent. If any system votes no, the coordinator sends a rollback command to undo any changes, ensuring no partial updates remain.
Commit phase finalizes changes only if all systems agree; otherwise, it cancels the transaction.
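The decision rule in the commit phase is small enough to write down directly. The sketch below assumes votes arrive as a name-to-boolean mapping; `decide` and `finish` are hypothetical helper names, not part of any library.

```python
def decide(votes):
    """Return "COMMIT" only if every vote is yes, otherwise "ROLLBACK"."""
    return "COMMIT" if votes and all(votes.values()) else "ROLLBACK"


def finish(votes):
    """Build the message the coordinator would send to each participant."""
    decision = decide(votes)
    # Every participant gets the same decision, including those that voted yes.
    return {name: decision for name in votes}


print(finish({"P1": True, "P2": True, "P3": True}))
print(finish({"P1": True, "P2": True, "P3": False}))
```

In the second call a single "no" vote from P3 means P1 and P2 are also told to roll back, even though they voted yes.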
Real World Analogy

Imagine a group of friends deciding to buy a gift together. First, they all check if they have enough money and agree to contribute. Only if everyone agrees do they buy the gift. If even one friend cannot pay, they all decide not to buy it to avoid problems.

Distributed Transactions → Friends needing to pool money from different wallets to buy one gift
Two-Phase Commit (2PC) Protocol → The two-step process of checking agreement first, then buying the gift
Prepare Phase → Friends asking each other if they can pay their share
Commit Phase → Actually buying the gift only if everyone agrees, or canceling if not
Diagram
┌───────────────┐        ┌───────────────┐        ┌───────────────┐
│ Coordinator   │        │ Participant 1 │        │ Participant 2 │
└──────┬────────┘        └──────┬────────┘        └──────┬────────┘
       │ Prepare request        │                        │
       │───────────────────────▶│                        │
       │ Prepare request        │                        │
       │────────────────────────────────────────────────▶│
       │                        │                        │
       │ Vote: OK / Fail        │                        │
       │◀───────────────────────│                        │
       │                        │ Vote: OK / Fail        │
       │◀────────────────────────────────────────────────│
       │                        │                        │
       │ If all OK, send Commit │                        │
       │───────────────────────▶│                        │
       │ If all OK, send Commit │                        │
       │────────────────────────────────────────────────▶│
       │                        │                        │
This diagram shows the coordinator sending prepare requests to participants, collecting their responses, and then sending commit commands if all agree.
Key Facts
Distributed Transaction: A transaction that involves multiple separate databases or systems working together.
Two-Phase Commit (2PC): A protocol that coordinates distributed transactions in two steps: prepare and commit.
Prepare Phase: The first step in 2PC where participants confirm readiness to commit.
Commit Phase: The second step in 2PC where participants finalize or abort the transaction.
Coordinator: The component that manages the 2PC process by communicating with participants.
Code Example
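class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # set to False to simulate a "no" vote
        self.prepared = False

    def prepare(self):
        # Simulate the readiness check and return this participant's vote
        print(f"{self.name}: Preparing transaction")
        self.prepared = self.can_commit
        return self.prepared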

    def commit(self):
        if self.prepared:
            print(f"{self.name}: Committing transaction")
            self.prepared = False
        else:
            print(f"{self.name}: Cannot commit, not prepared")

    def rollback(self):
        print(f"{self.name}: Rolling back transaction")
        self.prepared = False

class Coordinator:
    def __init__(self, participants):
        self.participants = participants

    def two_phase_commit(self):
        print("Coordinator: Starting prepare phase")
        votes = [p.prepare() for p in self.participants]
        if all(votes):
            print("Coordinator: All participants prepared, committing")
            for p in self.participants:
                p.commit()
        else:
            print("Coordinator: Prepare failed, rolling back")
            for p in self.participants:
                p.rollback()

p1 = Participant("DB1")
p2 = Participant("DB2")
coord = Coordinator([p1, p2])
coord.two_phase_commit()
Common Confusions
Believing that 2PC guarantees no delays or failures. 2PC can stall: the coordinator waits for every vote before deciding, and a participant that has voted yes must wait for the coordinator's decision, so a crash on either side can delay or block the transaction.
Thinking that participants commit changes during the prepare phase. During prepare, participants only do the work needed to guarantee they can commit and promise to do so; the changes become permanent only in the commit phase.
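Both confusions can be illustrated with a small participant state machine. This is a sketch under simplifying assumptions: `StagedParticipant` and its method names are hypothetical, state lives in memory, and a real system would record its promise in durable storage.

```python
class StagedParticipant:
    """Hypothetical participant that stages writes until told to commit."""

    def __init__(self):
        self.data = {}        # committed, visible state
        self.staged = None    # pending change held during the transaction
        self.state = "INIT"   # INIT -> PREPARED -> COMMITTED or ABORTED

    def prepare(self, key, value):
        # Stage the change and promise to commit it if asked.
        # A real system would also write this promise to durable storage.
        self.staged = (key, value)
        self.state = "PREPARED"
        return True

    def commit(self):
        if self.state != "PREPARED":
            raise RuntimeError("cannot commit: not prepared")
        key, value = self.staged
        self.data[key] = value    # the change becomes visible only now
        self.staged = None
        self.state = "COMMITTED"

    def rollback(self):
        self.staged = None        # discard the pending change
        self.state = "ABORTED"

    def on_timeout(self):
        # Before voting, a participant may abort on its own.
        if self.state == "INIT":
            self.rollback()
        # After voting yes it is "in doubt": it must keep waiting for the
        # coordinator's decision, which is where 2PC can block.
        return self.state


p = StagedParticipant()
p.prepare("balance", 70)
print(p.data, p.state)   # {} PREPARED: nothing is visible yet
print(p.on_timeout())    # PREPARED: still waiting, not aborted
p.commit()
print(p.data, p.state)   # {'balance': 70} COMMITTED
```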
Summary
Distributed transactions ensure multiple systems update data together to keep information consistent.
The Two-Phase Commit protocol coordinates these updates in two steps: prepare and commit.
If any system cannot commit, the entire transaction is rolled back to avoid partial changes.

Practice

1. What is the main purpose of the Two-Phase Commit (2PC) protocol in distributed transactions?
easy
A. To ensure all participating systems agree to commit or abort a transaction
B. To speed up transaction processing by skipping checks
C. To allow partial commits in case of failures
D. To encrypt data during transaction processing

Solution

  1. Step 1: Understand the role of 2PC in distributed systems

    2PC coordinates multiple systems to either all commit or all abort a transaction, ensuring consistency.
  2. Step 2: Analyze the options

    Only option A, "To ensure all participating systems agree to commit or abort a transaction", correctly describes 2PC's goal of agreement before finalizing changes.
  3. Final Answer:

    To ensure all participating systems agree to commit or abort a transaction -> Option A
  4. Quick Check:

    2PC ensures agreement = To ensure all participating systems agree to commit or abort a transaction [OK]
Hint: 2PC means all or nothing commit agreement [OK]
Common Mistakes:
  • Thinking 2PC speeds up transactions
  • Believing partial commits are allowed
  • Confusing 2PC with encryption
2. Which of the following is the correct sequence of phases in the Two-Phase Commit protocol?
easy
A. Commit phase followed by Prepare phase
B. Commit phase only
C. Abort phase followed by Prepare phase
D. Prepare phase followed by Commit phase

Solution

  1. Step 1: Recall the 2PC phases

    The protocol first asks participants to prepare (vote), then commits if all agree.
  2. Step 2: Match phases to options

    Option D, "Prepare phase followed by Commit phase", lists the phases in the correct order: Prepare first, then Commit.
  3. Final Answer:

    Prepare phase followed by Commit phase -> Option D
  4. Quick Check:

    2PC phases = Prepare then Commit [OK]
Hint: Prepare before commit in 2PC sequence [OK]
Common Mistakes:
  • Reversing the order of phases
  • Ignoring the Prepare phase
  • Thinking Commit happens alone
3. Consider a distributed transaction using 2PC with three participants: P1, P2, and P3. If P1 and P2 vote to commit but P3 votes to abort during the Prepare phase, what will be the final outcome?
medium
A. Only P1 and P2 commit, P3 aborts
B. All participants abort the transaction
C. All participants commit the transaction
D. Transaction is left in uncertain state

Solution

  1. Step 1: Understand voting in 2PC Prepare phase

    All participants must vote to commit for the transaction to proceed; any abort vote causes abort.
  2. Step 2: Apply voting results

    Since P3 votes to abort, the coordinator instructs all to abort to keep data consistent.
  3. Final Answer:

    All participants abort the transaction -> Option B
  4. Quick Check:

    Any abort vote = all abort [OK]
Hint: One abort vote cancels entire transaction [OK]
Common Mistakes:
  • Assuming partial commits are allowed
  • Thinking transaction stays uncertain
  • Ignoring abort votes
4. A distributed transaction using 2PC is stuck indefinitely in the Commit phase. What is the most likely cause of this problem?
medium
A. A participant failed to send its vote during the Prepare phase
B. All participants voted to abort during Prepare phase
C. The coordinator crashed after sending Commit messages but before receiving acknowledgments
D. The transaction was never started

Solution

  1. Step 1: Identify causes of blocking in Commit phase

    If the coordinator crashes while delivering Commit, any participant that voted yes but has not received the decision must wait, holding its locks, until the coordinator recovers.
  2. Step 2: Analyze options

    Option C, "The coordinator crashed after sending Commit messages but before receiving acknowledgments", matches this scenario; the other options relate to earlier phases or to no transaction at all.
  3. Final Answer:

    The coordinator crashed after sending Commit messages but before receiving acknowledgments -> Option C
  4. Quick Check:

    Coordinator crash during Commit causes blocking [OK]
Hint: Coordinator crash after commit message causes blocking [OK]
Common Mistakes:
  • Confusing Prepare phase failures with Commit phase blocking
  • Assuming abort votes cause commit blocking
  • Ignoring coordinator role
5. In a distributed system using 2PC, how can the protocol be improved to avoid the blocking problem caused by coordinator failure during the Commit phase?
hard
A. Use a Three-Phase Commit protocol that adds a pre-commit phase
B. Skip the Prepare phase to speed up commits
C. Allow participants to commit independently without coordinator
D. Increase the timeout for participant responses

Solution

  1. Step 1: Understand blocking in 2PC

    2PC can block if the coordinator fails after participants have voted yes but before they all learn the decision.
  2. Step 2: Identify protocol improvements

    Three-Phase Commit adds a pre-commit phase so that, under its assumptions (such as no network partitions), participants can reach a decision without the coordinator, which reduces blocking.
  3. Final Answer:

    Use a Three-Phase Commit protocol that adds a pre-commit phase -> Option A
  4. Quick Check:

    3PC adds pre-commit to avoid blocking [OK]
Hint: 3PC adds pre-commit phase to prevent blocking [OK]
Common Mistakes:
  • Skipping Prepare phase breaks consistency
  • Allowing independent commits causes inconsistency
  • Just increasing timeout doesn't fix blocking