Consumer throughput optimization in Kafka - Time & Space Complexity
When working with Kafka consumers, it is important to understand how processing time scales as the message load grows: if the consumer receives twice as many messages, how does the time to consume them change?
Analyze the time complexity of the following Kafka consumer code snippet.
```java
consumer.subscribe(Collections.singletonList("topic"));
while (true) {
    // Fetch the next batch of records, waiting up to 100 ms if none are available.
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        process(record); // per-message work
    }
    // Block until the offsets for this batch are committed.
    consumer.commitSync();
}
```
This code subscribes to a topic, polls messages in batches, processes each message, and commits offsets synchronously.
Look at what repeats as the consumer runs.
- Primary operation: Loop over all messages in each batch to process them.
- How many times: Once per message in each batch, repeated continuously as new messages arrive.
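The per-batch cost can be sketched without a broker at all. The following plain-Java simulation stands in for one `poll()` result; `simulateBatch` and its operation counter are illustrative names, not Kafka APIs:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PollLoopCost {
    // Simulate processing one polled batch; returns how many process() calls it makes.
    static int simulateBatch(List<String> records) {
        int operations = 0;
        for (String record : records) {
            process(record);
            operations++; // one unit of work per message
        }
        return operations; // equals records.size(): linear in batch size
    }

    static void process(String record) {
        // Stand-in for real per-message work (deserialization, business logic, etc.)
    }

    public static void main(String[] args) {
        List<String> batch = IntStream.range(0, 100)
                .mapToObj(i -> "msg-" + i)
                .collect(Collectors.toList());
        System.out.println(simulateBatch(batch)); // prints 100
    }
}
```

Whatever batch size `poll()` returns, the inner loop runs exactly once per record, which is the linear pattern the table below makes concrete.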
As the number of messages increases, the consumer processes more messages each time it polls.
| Input Size (messages per batch) | Approx. Operations (processing each message) |
|---|---|
| 10 | 10 processing calls |
| 100 | 100 processing calls |
| 1000 | 1000 processing calls |
Pattern observation: The work grows directly with the number of messages in each batch.
Time Complexity: O(n)
This means the time to process messages grows linearly with the number of messages received.
[X] Wrong: "Processing messages in batches makes the time constant regardless of batch size."
[OK] Correct: Even in batches, each message still needs processing, so time grows with batch size.
Understanding how message processing time grows helps you design efficient consumers and shows you can reason about real-world streaming systems.
"What if we processed messages asynchronously instead of one by one? How would the time complexity change?"
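One way to explore that question is a thread-pool sketch using only the JDK's `java.util.concurrent` (no Kafka dependency; the `process` method and counter are illustrative). Submitting work still costs one operation per message, so the total work stays O(n), but several messages can be in flight at once, which can shrink wall-clock time by roughly the pool size:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncProcessingSketch {
    static final AtomicInteger processed = new AtomicInteger();

    static void process(String record) {
        processed.incrementAndGet(); // stand-in for real per-message work
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> batch = List.of("m1", "m2", "m3", "m4");
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Still one submission per message -- O(n) total work --
        // but up to 4 messages are processed concurrently.
        for (String record : batch) {
            pool.submit(() -> process(record));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(processed.get()); // prints 4
    }
}
```

Note that with asynchronous processing, calling `commitSync()` right after the loop would commit offsets for messages that may not have finished yet; a real consumer would need to track task completion before committing.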