0
0
DBMS Theoryknowledge~6 mins

Multi-version concurrency control (MVCC) in DBMS Theory - Full Explanation

Choose your learning style9 modes available
Introduction
Imagine many people trying to read and write in a shared notebook at the same time. Without a good system, their notes could get mixed up or lost. Multi-version concurrency control helps databases handle many users working together without conflicts or delays.
Explanation
Multiple Versions of Data
Instead of changing data directly, MVCC keeps several versions of the same data. Each change creates a new version, so readers can see a stable snapshot without waiting for writes to finish.
MVCC stores multiple versions of data to allow simultaneous reads and writes without blocking.
Snapshot Isolation
When a user reads data, they see a consistent snapshot from a specific point in time. This means their view does not change even if others update the data later, preventing confusion from seeing partial changes.
Snapshot isolation ensures each user sees a stable view of data during their transaction.
Non-blocking Reads
Readers do not block writers and writers do not block readers. Because readers access old versions, they never have to wait for ongoing writes to complete, improving performance and user experience.
MVCC allows reads and writes to happen at the same time without waiting.
Garbage Collection
Old versions of data that are no longer needed are cleaned up by the system. This process frees storage space and keeps the database efficient over time.
Unused old data versions are removed to maintain database efficiency.
Real World Analogy

Imagine a library where every time a book is updated, a new copy is made instead of changing the original. Readers can borrow the copy they want without waiting for updates, and old copies are recycled when no longer needed.

Multiple Versions of Data → Making new copies of books instead of changing the original
Snapshot Isolation → Readers borrowing a specific copy that doesn't change while they read
Non-blocking Reads → Readers and writers using different copies so they don't wait for each other
Garbage Collection → Recycling old book copies that no one is using anymore
Diagram
Diagram
┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│ Version 1     │──────▶│ Version 2     │──────▶│ Version 3     │
│ (Old Data)    │       │ (Updated)     │       │ (Latest)      │
└───────────────┘       └───────────────┘       └───────────────┘
       ▲                      ▲                      ▲
       │                      │                      │
Reader sees           Reader sees           Writer creates
Version 1 snapshot    Version 2 snapshot    Version 3 snapshot

Garbage collection removes versions no longer in use.
This diagram shows how multiple versions of data exist over time, with readers accessing stable snapshots and writers creating new versions.
Key Facts
MVCCA method that keeps multiple versions of data to allow concurrent access without locking.
Snapshot IsolationA technique where each transaction sees a consistent snapshot of the database at a point in time.
Non-blocking ReadsReads that do not wait for writes to finish because they access older data versions.
Garbage CollectionThe process of removing old data versions that are no longer needed.
Code Example
DBMS Theory
import sqlite3

conn = sqlite3.connect(':memory:')
conn.isolation_level = None  # Autocommit mode for manual control
cur = conn.cursor()

# Create table
cur.execute('CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)')

# Insert initial data
cur.execute("INSERT INTO items (value) VALUES ('Version 1')")

# Start transaction 1 (reader)
cur.execute('BEGIN')
cur.execute('SELECT * FROM items')
print('Transaction 1 reads:', cur.fetchall())

# Start transaction 2 (writer)
cur2 = conn.cursor()
cur2.execute('BEGIN')
cur2.execute("UPDATE items SET value = 'Version 2' WHERE id = 1")

# Transaction 1 reads again (still sees old data)
cur.execute('SELECT * FROM items')
print('Transaction 1 reads again:', cur.fetchall())

# Commit transaction 2
cur2.execute('COMMIT')

# Transaction 1 reads after commit (still old snapshot)
cur.execute('SELECT * FROM items')
print('Transaction 1 reads after commit:', cur.fetchall())

# Commit transaction 1
cur.execute('COMMIT')

# New transaction reads latest data
cur.execute('BEGIN')
cur.execute('SELECT * FROM items')
print('New transaction reads:', cur.fetchall())
cur.execute('COMMIT')
OutputSuccess
Common Confusions
MVCC means no conflicts ever happen.
MVCC means no conflicts ever happen. MVCC reduces conflicts but does not eliminate them; write-write conflicts still require resolution.
Readers always see the very latest data.
Readers always see the very latest data. Readers see a snapshot from when their transaction started, not the most recent changes made after.
Summary
MVCC allows multiple users to read and write data at the same time by keeping several versions of the data.
Each user sees a consistent snapshot of the data, preventing delays and conflicts during concurrent access.
Old versions are cleaned up automatically to keep the database efficient.