
Distributed transactions and 2PC in DBMS Theory - Full Explanation

Introduction
Imagine you need to update information stored in different places at the same time, but you want to make sure all updates happen together or none at all. This problem arises in systems where data is spread across multiple databases or servers. Distributed transactions and the Two-Phase Commit protocol help solve this challenge by coordinating these updates safely.
Explanation
Distributed Transactions
Distributed transactions involve operations that span multiple separate databases or systems. They ensure that all parts of the transaction either complete successfully or all fail, maintaining data consistency across all locations. This is important because partial updates can cause errors or data corruption.
Distributed transactions guarantee that multiple systems update data together as a single unit.
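To see why the all-or-nothing guarantee matters, here is a minimal sketch of what can go wrong without it. Plain dictionaries stand in for two separate databases, and the names `bank_a`, `bank_b`, and `naive_transfer` are hypothetical, chosen only for illustration.

```python
# Two independent "databases", modeled as plain dictionaries (an assumption
# made for illustration; real systems would be separate servers).
bank_a = {"alice": 100}
bank_b = {"bob": 50}


def naive_transfer(amount, fail_on_credit=False):
    """Debit bank_a, then credit bank_b, with no coordination between them."""
    bank_a["alice"] -= amount  # the first update succeeds
    if fail_on_credit:
        # Simulate the second system failing after the first already changed.
        raise ConnectionError("bank_b is unreachable")
    bank_b["bob"] += amount  # never reached when the failure is simulated


try:
    naive_transfer(30, fail_on_credit=True)
except ConnectionError as exc:
    print(f"Transfer failed: {exc}")

# The debit happened but the credit did not, so money has gone missing.
total = bank_a["alice"] + bank_b["bob"]
print(bank_a, bank_b, total)
```

The system started with 150 units in total and ends with fewer, which is exactly the kind of partial update a distributed transaction is meant to rule out.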
Two-Phase Commit (2PC) Protocol
The Two-Phase Commit protocol is a method to coordinate distributed transactions. It works in two steps: first, a prepare phase in which every system votes on whether it can commit; second, a commit phase in which the changes are made permanent. If any system cannot commit, all changes are rolled back to keep data consistent.
2PC ensures all systems agree before making permanent changes, preventing partial updates.
Prepare Phase
In the prepare phase, the coordinator asks all participating systems if they are ready to commit the transaction. Each system checks if it can complete the operation and replies with a yes or no. This phase is crucial to detect any problems before making changes permanent.
Prepare phase checks readiness of all systems to commit before proceeding.
Commit Phase
If all systems respond positively in the prepare phase, the coordinator sends a commit command to all participants. Each system then makes the changes permanent. If any system votes no, the coordinator sends a rollback command to undo any changes, ensuring no partial updates remain.
Commit phase finalizes changes only if all systems agree; otherwise, it cancels the transaction.
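The decision rule can be sketched in a few lines. This is a simplified illustration, not a production implementation: `VotingParticipant` and `decide` are hypothetical names, each participant's vote is fixed in advance, and there is no networking, logging, or failure handling.

```python
class VotingParticipant:
    """Hypothetical participant whose vote is fixed up front for the demo."""

    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Vote yes only if this participant is able to commit.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def decide(participants):
    """Collect every vote, then apply one decision to all participants."""
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "rollback"
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.rollback()
    return decision


# DB2 votes no, so DB1 must roll back too even though it voted yes.
nodes = [VotingParticipant("DB1", True), VotingParticipant("DB2", False)]
outcome = decide(nodes)
print(outcome, [(p.name, p.state) for p in nodes])
```

A single no vote is enough to cancel the transaction for everyone, including participants that were ready to commit.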
Real World Analogy

Imagine a group of friends deciding to buy a gift together. First, they all check if they have enough money and agree to contribute. Only if everyone agrees do they buy the gift. If even one friend cannot pay, they all decide not to buy it to avoid problems.

Distributed Transactions → Friends needing to pool money from different wallets to buy one gift
Two-Phase Commit (2PC) Protocol → The two-step process of checking agreement first, then buying the gift
Prepare Phase → Friends asking each other if they can pay their share
Commit Phase → Actually buying the gift only if everyone agrees, or canceling if not
Diagram
┌───────────────┐        ┌───────────────┐        ┌───────────────┐
│ Coordinator   │        │ Participant 1 │        │ Participant 2 │
└──────┬────────┘        └──────┬────────┘        └──────┬────────┘
       │ Prepare request        │                        │
       │───────────────────────▶│                        │
       │ Prepare OK / Fail      │                        │
       │◀───────────────────────│                        │
       │ Prepare request        │                        │
       │────────────────────────────────────────────────▶│
       │ Prepare OK / Fail      │                        │
       │◀────────────────────────────────────────────────│
       │ If all OK, send Commit │                        │
       │───────────────────────▶│                        │
       │ If all OK, send Commit │                        │
       │────────────────────────────────────────────────▶│
This diagram shows the coordinator sending prepare requests to participants, collecting their responses, and then sending commit commands if all agree.
Key Facts
Distributed Transaction: A transaction that involves multiple separate databases or systems working together.
Two-Phase Commit (2PC): A protocol that coordinates distributed transactions in two steps: prepare and commit.
Prepare Phase: The first step in 2PC where participants confirm readiness to commit.
Commit Phase: The second step in 2PC where participants finalize or abort the transaction.
Coordinator: The component that manages the 2PC process by communicating with participants.
Code Example
class Participant:
    """One database taking part in the transaction."""

    def __init__(self, name):
        self.name = name
        self.prepared = False  # True once this participant has voted yes

    def prepare(self):
        # Simulate the readiness check; this demo participant always votes yes.
        print(f"{self.name}: Preparing transaction")
        self.prepared = True
        return True

    def commit(self):
        # Committing is only valid after a successful prepare.
        if self.prepared:
            print(f"{self.name}: Committing transaction")
            self.prepared = False
        else:
            print(f"{self.name}: Cannot commit, not prepared")

    def rollback(self):
        # Discard any prepared state.
        print(f"{self.name}: Rolling back transaction")
        self.prepared = False


class Coordinator:
    """Drives both phases of the protocol across all participants."""

    def __init__(self, participants):
        self.participants = participants

    def two_phase_commit(self):
        # Phase 1: ask every participant to prepare and collect the votes.
        print("Coordinator: Starting prepare phase")
        votes = [p.prepare() for p in self.participants]
        # Phase 2: commit only if every vote was yes; otherwise roll back all.
        if all(votes):
            print("Coordinator: All participants prepared, committing")
            for p in self.participants:
                p.commit()
        else:
            print("Coordinator: Prepare failed, rolling back")
            for p in self.participants:
                p.rollback()


# Run the protocol with two participants.
p1 = Participant("DB1")
p2 = Participant("DB2")
coord = Coordinator([p1, p2])
coord.two_phase_commit()
Common Confusions
Believing that 2PC guarantees no delays or failures.
2PC is a blocking protocol. The coordinator waits for every vote before deciding, so a slow or crashed participant delays the whole transaction, and a participant that has already voted yes must wait for the coordinator's decision before it can finish.
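The blocking behavior can be sketched from a participant's point of view. This is an illustrative model only: `State`, `InDoubtParticipant`, and `on_timeout` are hypothetical names, and the timeout is triggered by a direct method call rather than a real timer.

```python
from enum import Enum


class State(Enum):
    WORKING = "working"    # has not voted yet
    PREPARED = "prepared"  # voted yes, waiting for the coordinator's decision
    ABORTED = "aborted"


class InDoubtParticipant:
    """Hypothetical participant that reacts to a missing coordinator message."""

    def __init__(self, name):
        self.name = name
        self.state = State.WORKING

    def prepare(self):
        self.state = State.PREPARED
        return True

    def on_timeout(self):
        """Called when no message from the coordinator arrives in time."""
        if self.state is State.WORKING:
            # No vote has been cast, so aborting alone is safe.
            self.state = State.ABORTED
        # Once PREPARED, the participant cannot decide alone: the coordinator
        # may already have told others to commit, so it has to keep waiting.
        return self.state


early = InDoubtParticipant("DB1")
early_result = early.on_timeout()

late = InDoubtParticipant("DB2")
late.prepare()
late_result = late.on_timeout()

print(early_result.value, late_result.value)
```

The second participant stays in the prepared state after the timeout, which is where the delay comes from: it has promised to commit and can neither commit nor abort until it hears the decision.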
Thinking that participants commit changes during the prepare phase.
During prepare, participants only check that they can commit and promise to do so. The changes become permanent only in the commit phase.
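One way to picture the difference is a participant that stages its writes during prepare and applies them only on commit. The sketch below is an assumption-laden simplification: `StagingStore` is a hypothetical in-memory key-value store, and real databases use durable logs and locks rather than a pending dictionary.

```python
class StagingStore:
    """Hypothetical key-value participant that stages writes during prepare."""

    def __init__(self):
        self.data = {}       # committed, visible values
        self.pending = None  # writes promised during prepare, not yet applied

    def prepare(self, writes):
        # Promise to commit: remember the writes, but do not apply them.
        self.pending = dict(writes)
        return True

    def commit(self):
        # Only now do the staged writes become visible.
        if self.pending is not None:
            self.data.update(self.pending)
            self.pending = None

    def rollback(self):
        # Drop the staged writes; visible data is untouched.
        self.pending = None


store = StagingStore()
store.prepare({"stock": 9})
visible_after_prepare = dict(store.data)
store.commit()
visible_after_commit = dict(store.data)
print(visible_after_prepare, visible_after_commit)
```

After prepare the visible data is still empty; the new value appears only once commit runs.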
Summary
Distributed transactions ensure multiple systems update data together to keep information consistent.
The Two-Phase Commit protocol coordinates these updates in two steps: prepare and commit.
If any system cannot commit, the entire transaction is rolled back to avoid partial changes.