DBMS Theoryknowledge~6 mins

Multi-version concurrency control (MVCC) in DBMS Theory - Full Explanation

Choose your learning style9 modes available

Learn Why Deep Visual Practice Challenge Project Recall Time

Introduction

Imagine many people trying to read and write in a shared notebook at the same time. Without a good system, their notes could get mixed up or lost. Multi-version concurrency control helps databases handle many users working together without conflicts or delays.

Explanation

Multiple Versions of Data

Instead of changing data directly, MVCC keeps several versions of the same data. Each change creates a new version, so readers can see a stable snapshot without waiting for writes to finish.

MVCC stores multiple versions of data to allow simultaneous reads and writes without blocking.

Snapshot Isolation

When a user reads data, they see a consistent snapshot from a specific point in time. This means their view does not change even if others update the data later, preventing confusion from seeing partial changes.

Snapshot isolation ensures each user sees a stable view of data during their transaction.

Non-blocking Reads

Readers do not block writers and writers do not block readers. Because readers access old versions, they never have to wait for ongoing writes to complete, improving performance and user experience.

MVCC allows reads and writes to happen at the same time without waiting.

Garbage Collection

Old versions of data that are no longer needed are cleaned up by the system. This process frees storage space and keeps the database efficient over time.

Unused old data versions are removed to maintain database efficiency.

Real World Analogy

Imagine a library where every time a book is updated, a new copy is made instead of changing the original. Readers can borrow the copy they want without waiting for updates, and old copies are recycled when no longer needed.

Multiple Versions of Data → Making new copies of books instead of changing the original

Snapshot Isolation → Readers borrowing a specific copy that doesn't change while they read

Non-blocking Reads → Readers and writers using different copies so they don't wait for each other

Garbage Collection → Recycling old book copies that no one is using anymore

Diagram

┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│ Version 1     │──────▶│ Version 2     │──────▶│ Version 3     │
│ (Old Data)    │       │ (Updated)     │       │ (Latest)      │
└───────────────┘       └───────────────┘       └───────────────┘
       ▲                      ▲                      ▲
       │                      │                      │
Reader sees           Reader sees           Writer creates
Version 1 snapshot    Version 2 snapshot    Version 3 snapshot

Garbage collection removes versions no longer in use.

This diagram shows how multiple versions of data exist over time, with readers accessing stable snapshots and writers creating new versions.

Key Facts

MVCC → A method that keeps multiple versions of data to allow concurrent access without locking.

Snapshot Isolation → A technique where each transaction sees a consistent snapshot of the database at a point in time.

Non-blocking Reads → Reads that do not wait for writes to finish because they access older data versions.

Garbage Collection → The process of removing old data versions that are no longer needed.

Code Example

DBMS Theory

import sqlite3

conn = sqlite3.connect(':memory:')
conn.isolation_level = None  # Autocommit mode for manual control
cur = conn.cursor()

# Create table
cur.execute('CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)')

# Insert initial data
cur.execute("INSERT INTO items (value) VALUES ('Version 1')")

# Start transaction 1 (reader)
cur.execute('BEGIN')
cur.execute('SELECT * FROM items')
print('Transaction 1 reads:', cur.fetchall())

# Start transaction 2 (writer)
cur2 = conn.cursor()
cur2.execute('BEGIN')
cur2.execute("UPDATE items SET value = 'Version 2' WHERE id = 1")

# Transaction 1 reads again (still sees old data)
cur.execute('SELECT * FROM items')
print('Transaction 1 reads again:', cur.fetchall())

# Commit transaction 2
cur2.execute('COMMIT')

# Transaction 1 reads after commit (still old snapshot)
cur.execute('SELECT * FROM items')
print('Transaction 1 reads after commit:', cur.fetchall())

# Commit transaction 1
cur.execute('COMMIT')

# New transaction reads latest data
cur.execute('BEGIN')
cur.execute('SELECT * FROM items')
print('New transaction reads:', cur.fetchall())
cur.execute('COMMIT')

OutputSuccess

Common Confusions

MVCC means no conflicts ever happen.

MVCC means no conflicts ever happen. MVCC reduces conflicts but does not eliminate them; write-write conflicts still require resolution.

Readers always see the very latest data.

Readers always see the very latest data. Readers see a snapshot from when their transaction started, not the most recent changes made after.

Summary

MVCC allows multiple users to read and write data at the same time by keeping several versions of the data.

Each user sees a consistent snapshot of the data, preventing delays and conflicts during concurrent access.

Old versions are cleaned up automatically to keep the database efficient.