Batch size and compression tuning in Kafka - Time & Space Complexity
When tuning batch size and compression in Kafka, we want to understand how these settings affect the time it takes to send messages.
The question: how does changing the batch size or compression type change the work the producer does as message volume grows?
Below, we analyze the time complexity of sending messages under given batch size and compression settings.
Properties producerConfig = new Properties();
producerConfig.put("batch.size", 16384);        // max bytes buffered per batch (16 KB default)
producerConfig.put("compression.type", "gzip"); // compress each batch before sending
// ... bootstrap.servers and serializer settings omitted ...
KafkaProducer<String, String> producer = new KafkaProducer<>(producerConfig);
for (int i = 0; i < numMessages; i++) {
    producer.send(new ProducerRecord<>(topic, key, value)); // buffered, not sent immediately
}
producer.flush(); // block until all buffered batches have been sent
This code sends many messages, grouping them in batches and compressing each batch before sending.
Look at what repeats as messages increase.
- Primary operation: Sending messages in batches inside the loop.
- How many times: The loop runs once per message, but actual network sends happen per batch.
As the number of messages grows, the number of batches grows more slowly, because each batch holds many messages.
| Messages (n) | Approx. batches sent (~100-byte messages, 16 KB batch.size) |
|---|---|
| 10 | 1 |
| 100 | 1 |
| 1000 | 7 |
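The relationship in the table can be estimated directly by dividing total bytes by batch capacity. A minimal sketch, assuming ~114-byte messages (a value chosen to reproduce the table above; real batching also depends on linger.ms and per-partition boundaries):

```java
public class BatchCount {
    // Estimate how many batches n messages of a given size fill,
    // given the producer's batch.size in bytes. Illustrative only.
    static long batchesNeeded(long numMessages, long messageBytes, long batchSizeBytes) {
        long totalBytes = numMessages * messageBytes;
        return (totalBytes + batchSizeBytes - 1) / batchSizeBytes; // ceiling division
    }

    public static void main(String[] args) {
        // Assumed ~114-byte messages against the 16 KB default batch.size
        for (long n : new long[] {10, 100, 1000}) {
            System.out.println(n + " messages -> "
                    + batchesNeeded(n, 114, 16384) + " batch(es)");
        }
    }
}
```

Doubling batch.size in this model roughly halves the batch count, which is exactly the O(n / b) behavior derived below.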
Pattern observation: The number of network sends grows in proportion to the number of batches, which is roughly the input size divided by the number of messages that fit in one batch.
Time Complexity: O(n / b)
Here n is the number of messages and b is the batch capacity in messages. Each message is still serialized and buffered once, but the expensive per-request work (compression, the network round trip) happens once per batch, so larger batches mean fewer send operations.
[X] Wrong: "Increasing batch size always makes sending faster because fewer batches are sent."
[OK] Correct: Larger batches can increase compression time and memory use, which may slow down processing despite fewer sends.
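The compression cost in the correction above can be felt directly by timing gzip on batches of different sizes. A minimal sketch using the JDK's GZIPOutputStream; the all-zero payload is an artificial, highly compressible stand-in for real records, and the timings are rough:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class CompressBatch {
    // Gzip a byte array and return the compressed bytes.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Larger batches cost more CPU per compression call,
        // even though fewer calls are made overall.
        for (int batchBytes : new int[] {16_384, 65_536, 262_144}) {
            byte[] batch = new byte[batchBytes]; // simulated batch payload
            long start = System.nanoTime();
            byte[] compressed = gzip(batch);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(batchBytes + " B -> " + compressed.length
                    + " B compressed in ~" + micros + " us");
        }
    }
}
```

On real, less compressible data the CPU cost per batch is higher still, which is why bigger batches do not automatically mean faster end-to-end sending.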
Understanding how batch size and compression affect performance shows you can balance speed and resource use, a key skill in real-world Kafka tuning.
What if we changed compression type from gzip to none? How would the time complexity change?
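As a starting point for exploring that question, the setting can be toggled in the producer config (a fragment, not a complete program):

```java
import java.util.Properties;

Properties producerConfig = new Properties();
producerConfig.put("batch.size", 16384);
// "none" removes the per-batch compression CPU cost: the loop still
// runs O(n) times and the producer still sends O(n / b) batches, but
// each batch is larger on the wire, trading bandwidth for CPU.
producerConfig.put("compression.type", "none");
```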