MVCC mental model in PostgreSQL - Time & Space Complexity
We want to understand how PostgreSQL handles multiple users reading and writing data at the same time.
How does the system keep things fast and correct as more users work together?
Analyze the time complexity of MVCC snapshot visibility checks in PostgreSQL.
-- Simplified query to select visible rows
SELECT * FROM orders
WHERE xmin < current_snapshot_xmin
AND (xmax = 0 OR xmax > current_snapshot_xmax);
This query returns rows visible to the current transaction based on their transaction IDs.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Checking each row's transaction IDs against the current snapshot.
- How many times: Once per row scanned in the table or index scan.
As the number of rows grows, the system checks more rows to decide visibility.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 visibility checks |
| 100 | About 100 visibility checks |
| 1000 | About 1000 visibility checks |
Pattern observation: The number of visibility checks grows roughly in direct proportion to the number of rows scanned.
Time Complexity: O(n)
This means the time to check which rows are visible grows linearly with the number of rows scanned.
[X] Wrong: "MVCC makes all queries run instantly regardless of data size."
[OK] Correct: Each row still needs to be checked for visibility, so more rows mean more work and longer time.
Understanding how MVCC scales helps you explain how databases handle many users smoothly, a key skill in database design and troubleshooting.
"What if we added an index to help skip invisible rows? How would the time complexity change?"