
What is batch.size in Kafka Producer and How It Works

batch.size is a Kafka producer setting that caps the size, in bytes, of a batch of records bound for a single partition. The producer groups records for the same partition into batches of up to this size, so fewer requests reach the broker and throughput improves. The default is 16384 bytes (16 KB).

How It Works

Imagine you are mailing letters. Instead of sending each letter individually, you put several letters in one envelope to save time and postage. batch.size works similarly for Kafka producers. It sets the maximum size of a group (batch) of messages that the producer collects before sending them to the Kafka broker.

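The producer sends a batch as soon as it is full or once linger.ms has elapsed, whichever comes first. batch.size is an upper bound, not a target: with linger.ms set to 0 the producer sends whatever has accumulated as soon as it can, so batches are often smaller than the limit. Sending records together reduces the number of network round trips and improves efficiency, especially when many small messages are produced.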
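The send trigger can be sketched as a small predicate. This is a simplified model for illustration, not the Kafka client's internal code: the class and method names (SendTriggerSketch, readyToSend) are hypothetical, and it ignores other conditions the real producer checks, such as memory pressure or an explicit flush.

```java
final class SendTriggerSketch {

    /**
     * Simplified model: a batch is ready once it has reached the size limit
     * (batch.size) or has waited at least the linger time (linger.ms).
     */
    static boolean readyToSend(int batchBytes, int batchSizeLimit, long waitedMs, long lingerMs) {
        boolean full = batchBytes >= batchSizeLimit;
        boolean lingerExpired = waitedMs >= lingerMs;
        return full || lingerExpired;
    }

    public static void main(String[] args) {
        System.out.println(readyToSend(16384, 16384, 0, 50));  // true: batch is full
        System.out.println(readyToSend(2000, 16384, 10, 50));  // false: not full, still lingering
        System.out.println(readyToSend(2000, 16384, 50, 50));  // true: linger time elapsed
        System.out.println(readyToSend(2000, 16384, 0, 0));    // true: linger.ms = 0 never waits
    }
}
```

The last case shows why a small batch can still go out immediately: with no linger time, size is only a ceiling.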


Example

This example shows how to set batch.size in a Kafka producer configuration using Java.

java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class KafkaBatchSizeExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384); // 16 KB batch size

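        // try-with-resources closes the producer, which flushes any batches still buffered
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10; i++) {
                // send() is asynchronous: it appends the record to a batch instead of sending it immediately
                producer.send(new ProducerRecord<>("my-topic", "key" + i, "value" + i));
            }
        }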
    }
}
Output
The program prints nothing. The ten records are sent to the Kafka broker in batches of up to 16 KB per partition.

When to Use

Use batch.size to improve producer throughput when sending many small messages. Increasing the batch size lets the producer send more data in one network call, reducing overhead.
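To get a rough feel for the effect, the sketch below counts how many batches a stream of equally sized records would fill for a given limit. It is a back-of-the-envelope model under stated assumptions: a single partition, no per-record or per-batch overhead, no compression, and batches that always fill before linger.ms expires. The names BatchCountSketch and countBatches are hypothetical.

```java
final class BatchCountSketch {

    /**
     * Counts the batches needed for recordCount records of recordBytes each,
     * when a batch may hold at most batchSizeBytes. A record that is larger
     * than the limit still gets a batch of its own.
     */
    static int countBatches(int recordBytes, int recordCount, int batchSizeBytes) {
        int batches = 0;
        int used = 0; // bytes in the batch currently being filled
        for (int i = 0; i < recordCount; i++) {
            if (used > 0 && used + recordBytes > batchSizeBytes) {
                batches++; // current batch is full, start a new one
                used = 0;
            }
            used += recordBytes;
        }
        if (used > 0) {
            batches++; // last, possibly partial, batch
        }
        return batches;
    }

    public static void main(String[] args) {
        // 1000 records of 100 bytes each
        System.out.println(countBatches(100, 1000, 16384)); // 7
        System.out.println(countBatches(100, 1000, 65536)); // 2
    }
}
```

Under these assumptions, moving from 16 KB to 64 KB cuts the batch count from 7 to 2. Real numbers will differ because record overhead, compression, and arrival rate all affect how full each batch gets.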

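However, a larger batch.size does not speed things up by itself. Each batch buffer is allocated at the configured size, so very large values use more producer memory, and batches only fill when records arrive fast enough or linger.ms gives them time to accumulate. That waiting time is where the added latency comes from. For low-latency needs, keep linger.ms small and batch.size moderate. For high throughput with less concern about delay, increase both.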

Typical use cases include log aggregation, metrics collection, or any scenario where many small messages are produced rapidly.

Key Points

  • batch.size is the max size in bytes for a batch of messages.
  • It helps reduce network calls by grouping messages.
  • A very large batch size uses more memory and, combined with a high linger.ms, can increase latency.
  • Adjust based on throughput vs latency needs.
  • Works together with linger.ms to control batching behavior.
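
As a starting point, the two ends of the throughput versus latency trade-off might be expressed as two sets of producer properties. The keys batch.size and linger.ms are the real configuration names, but the specific values and the class and method names (BatchingProfilesSketch, throughputProfile, lowLatencyProfile) are illustrative assumptions, not recommendations. Measure with your own workload before settling on numbers.

```java
import java.util.Properties;

final class BatchingProfilesSketch {

    /** Favors throughput: bigger batches and some time to fill them (illustrative values). */
    static Properties throughputProfile() {
        Properties props = new Properties();
        props.setProperty("batch.size", "65536"); // 64 KB upper bound per batch
        props.setProperty("linger.ms", "20");     // wait up to 20 ms for more records
        return props;
    }

    /** Favors latency: default-sized batches and no deliberate waiting (illustrative values). */
    static Properties lowLatencyProfile() {
        Properties props = new Properties();
        props.setProperty("batch.size", "16384"); // 16 KB upper bound per batch
        props.setProperty("linger.ms", "0");      // send as soon as possible
        return props;
    }

    public static void main(String[] args) {
        System.out.println("throughput batch.size = " + throughputProfile().getProperty("batch.size"));
        System.out.println("low-latency linger.ms = " + lowLatencyProfile().getProperty("linger.ms"));
    }
}
```

Either profile would still need the bootstrap servers and serializer settings added before being passed to a KafkaProducer.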

Key Takeaways

batch.size controls the max bytes of messages sent together by the Kafka producer.
Larger batch sizes improve throughput by reducing network calls, but they use more memory and may increase latency when paired with a higher linger.ms.
Set batch.size based on your application's speed and delay requirements.
It works with linger.ms to decide when batches are sent.
The default is 16 KB (16384 bytes); typical values range from a few KB to tens of KB depending on message size and volume.