In-sync replicas (ISR) in Kafka - Time & Space Complexity
When working with Kafka's In-sync replicas (ISR), it's important to understand how the system keeps track of replicas that are up-to-date.
We want to know how the time to check and update ISR changes as the number of replicas grows.
Analyze the time complexity of the following Kafka ISR update logic.
// Pseudocode for ISR update
for each replica in replicasList {
if (replica.lastOffset == leader.lastOffset) {
add replica to ISR
} else {
remove replica from ISR
}
}
This code checks each replica's offset against the leader's offset to update the ISR list.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Looping through all replicas to compare offsets.
- How many times: Once per replica in the replicas list.
The time to update ISR grows directly with the number of replicas.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 comparisons |
| 100 | 100 comparisons |
| 1000 | 1000 comparisons |
Pattern observation: Doubling replicas roughly doubles the work needed.
Time Complexity: O(n)
This means the time to update ISR grows linearly with the number of replicas.
[X] Wrong: "Checking ISR is constant time no matter how many replicas there are."
[OK] Correct: Each replica must be checked individually, so more replicas mean more work.
Understanding how ISR updates scale helps you explain Kafka's reliability and performance in real systems.
"What if we only checked replicas that recently changed offset? How would the time complexity change?"