Repeatable read behavior in PostgreSQL - Time & Space Complexity
When using repeatable read in PostgreSQL, we want to understand how the cost of keeping reads consistent grows with the amount of data involved.
The question: how does the system guarantee that repeated reads see the same data, and what does that guarantee cost as the data grows?
Analyze the time complexity of this transaction using repeatable read isolation.
```sql
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM orders WHERE customer_id = 123;
-- some processing here
SELECT * FROM orders WHERE customer_id = 123;
COMMIT;
```
This code reads the same customer's orders twice in one transaction, expecting the data not to change between reads.
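To make the guarantee concrete, here is a minimal Python sketch of snapshot-based repeatable read. It is a simplified model, not PostgreSQL internals: the `Table` and `RepeatableReadTxn` classes are illustrative names, and the snapshot is taken as a plain copy of committed rows at the transaction's first read.

```python
# Simplified model of snapshot isolation: a transaction's first read
# captures a snapshot, and later reads are served from that snapshot,
# so concurrent commits cannot change what the transaction sees.

class Table:
    def __init__(self):
        self.rows = []           # committed rows, visible to new snapshots

    def insert(self, row):
        self.rows.append(row)    # simulates another transaction committing


class RepeatableReadTxn:
    """Takes a snapshot on first read; subsequent reads reuse it."""

    def __init__(self, table):
        self.table = table
        self.snapshot = None

    def select(self, customer_id):
        if self.snapshot is None:
            # First read: capture the rows visible right now.
            self.snapshot = list(self.table.rows)
        return [r for r in self.snapshot if r["customer_id"] == customer_id]


orders = Table()
orders.insert({"customer_id": 123, "item": "book"})

txn = RepeatableReadTxn(orders)
first = txn.select(123)                              # sees 1 row
orders.insert({"customer_id": 123, "item": "pen"})   # concurrent commit
second = txn.select(123)                             # still 1 row

assert first == second   # the snapshot shields the transaction
```

Copying the whole table is only for illustration; real MVCC keeps row versions in place and decides visibility per row, which is exactly why the cost scales with the rows read.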
Look at what repeats during the transaction.
- Primary operation: Reading rows matching customer_id twice.
- How many times: Twice; PostgreSQL serves both reads from the snapshot taken at the transaction's first statement, so the second read is guaranteed to see the same rows.
As the number of orders for the customer grows, the work to read and keep track of these rows grows too.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Reading 10 rows twice, tracking 10 rows for consistency |
| 100 | Reading 100 rows twice, tracking 100 rows for consistency |
| 1000 | Reading 1000 rows twice, tracking 1000 rows for consistency |
Pattern observation: The work grows roughly in proportion to the number of rows read, because each row must be checked to stay consistent.
Time Complexity: O(n)
This means the time to keep reads repeatable grows linearly with the number of rows read: two scans of n matching rows is still O(n), and the per-row visibility check adds only constant work per row.
[X] Wrong: "Repeatable read means the database just locks the whole table, so time doesn't depend on rows."
[OK] Correct: PostgreSQL implements repeatable read with MVCC snapshots; reads take no row or table locks at all, but each scanned row is checked for visibility against the transaction's snapshot, so the cost depends on how many rows are read.
Understanding how repeatable read scales helps you explain transaction behavior clearly and shows you grasp how databases keep data consistent under the hood.
"What if we changed the isolation level to serializable? How would the time complexity of repeated reads change?"