CAP theorem in DBMS Theory - Time & Space Complexity
When studying the CAP theorem, it is important to understand how the time for read and write operations changes based on the consistency model and network conditions in a distributed database.
We want to know how replication and consensus protocols scale as the number of nodes increases.
Analyze the time complexity of write replication across nodes in a distributed database.
for each write_operation:
write to primary_node -- O(1)
for each replica_node in replicas:
send replication message -- O(1) per node
wait for acknowledgment (if CP) -- O(latency)
confirm write to client
This code processes a write by first writing to the primary node, then replicating to all replicas. In a CP system, it waits for acknowledgments before confirming.
Look at the loops and repeated operations in the replication process.
- Primary operation: Sending a replication message and waiting for acknowledgment from each replica node.
- How many times: Once per replica node for every write operation.
As the number of replica nodes grows, the replication work increases.
| Replicas (n) | CP System (wait all) | AP System (async) |
|---|---|---|
| 3 nodes | 3 acknowledgments waited | Fire and forget |
| 5 nodes | 5 acknowledgments waited | Fire and forget |
| 10 nodes | 10 acknowledgments waited | Fire and forget |
Pattern observation: CP systems have write latency proportional to the number of replicas (or quorum size). AP systems have constant write latency since they do not wait for acknowledgments.
CP System Write: O(n) where n is the number of replicas (or O(quorum) with majority-based consensus)
AP System Write: O(1) — write is confirmed immediately after the primary node accepts it
This trade-off is the core of the CAP theorem: consistency costs latency, availability gains speed.
[X] Wrong: "Replication is always instant because networks are fast."
[OK] Correct: Network latency and partitions are real. In a CP system, a single slow or unreachable replica can delay the entire write operation. This is why AP systems sacrifice consistency for predictable response times.
Understanding the latency trade-offs of CP vs AP systems helps you explain database selection decisions in system design interviews. Knowing that consistency requires coordination time (O(n) or O(quorum)) is a key insight.
If a distributed database uses quorum-based replication (majority must acknowledge), how does the write latency change compared to waiting for all replicas?