Batch size and compression tuning in Kafka - Commands & Configuration

Introduction
Sending messages efficiently in Kafka reduces network use and speeds up processing. The batch size setting controls how many messages are sent together, and the compression setting controls how much the data is shrunk before sending. Tuning them helps in these situations:
When you want to reduce the number of network calls by sending many messages at once.
When you need to lower the amount of data sent over the network to save bandwidth.
When your Kafka producer is slow because it sends messages one by one.
When you want to balance between latency and throughput in your Kafka messaging.
When you want to reduce storage space used by Kafka brokers by compressing messages.
Config File - producer.properties
batch.size=32768
compression.type=snappy
linger.ms=5

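batch.size: Sets the maximum size, in bytes, of a single batch of messages bound for one Kafka partition (32768 is 32 KiB; the default is 16384). Larger batches improve throughput and compression ratio but use more producer memory. A batch is sent as soon as it is full.

compression.type: Selects the compression algorithm the producer applies to each batch before sending. Supported values are 'none' (the default), 'gzip', 'snappy', 'lz4', and 'zstd'. 'snappy' is fast and trades some compression ratio for low CPU cost.

linger.ms: Sets the maximum time, in milliseconds, that the producer waits for more messages before sending a batch that is not yet full. Higher values produce fuller batches at the cost of added latency.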
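The interaction between batch.size and linger.ms can be sketched with a little arithmetic. The following is a simplified model, not the producer's actual logic: it assumes fixed-size messages arriving at a steady rate on one partition, ignores record overhead and compression, and uses a hypothetical helper named estimate_batch.

```python
def estimate_batch(msg_bytes, msgs_per_sec, batch_size=32768, linger_ms=5):
    """Estimate how many messages land in one batch for a single partition.

    Simplified model: a batch closes when it is full or when linger.ms
    expires, whichever happens first.
    """
    # Messages that fit before the batch reaches batch.size.
    fit_by_size = max(1, batch_size // msg_bytes)
    # Messages that arrive within the linger window (at least the first one).
    fit_by_time = max(1, int(msgs_per_sec * linger_ms / 1000))
    if fit_by_size <= fit_by_time:
        return fit_by_size, "batch.size"
    return fit_by_time, "linger.ms"


# 1 KiB messages at 2,000 msg/s: only 10 arrive in 5 ms, so linger.ms decides.
print(estimate_batch(1024, 2_000))     # (10, 'linger.ms')
# Same messages at 20,000 msg/s: 32 fill the batch before 5 ms pass.
print(estimate_batch(1024, 20_000))    # (32, 'batch.size')
```

In this model, raising batch.size only helps once traffic is high enough to fill batches within the linger window. At lower rates, linger.ms is the setting that determines batch fullness.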

Commands
Starts a Kafka producer using the configuration file to apply batch size and compression settings for sending messages efficiently.
Terminal
kafka-console-producer --broker-list localhost:9092 --topic example-topic --producer.config producer.properties
Expected Output
>
The command opens a > prompt where each line you type is sent as one message. It prints nothing else unless an error occurs.
--producer.config - Specifies the configuration file with batch size and compression settings.
--broker-list - Defines the Kafka broker addresses to connect to. Recent Kafka releases deprecate this flag in favor of --bootstrap-server.
--topic - Specifies the topic to send messages to.
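Sets compression.type to 'snappy' on broker 0. The broker default is 'producer', which stores each batch in whatever codec the producer used. With 'snappy' set, the broker recompresses batches that arrive in any other codec before writing them to its log.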
Terminal
kafka-configs --bootstrap-server localhost:9092 --entity-type brokers --entity-name 0 --alter --add-config compression.type=snappy
Expected Output
Completed updating config for broker 0.
--alter - Indicates that the configuration is being changed.
--add-config - Adds or updates the specified configuration key and value.
Describes the topic to show its partition layout and its non-default configs, such as compression.type. Producer-side settings like batch.size and linger.ms are not stored with the topic and never appear here.
Terminal
kafka-topics --bootstrap-server localhost:9092 --describe --topic example-topic
Expected Output
Topic: example-topic PartitionCount: 1 ReplicationFactor: 1 Configs: compression.type=snappy
    Topic: example-topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
--describe - Shows detailed information about the topic.
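When this check runs in a script, the Configs field can be pulled out of the describe output as text. The sketch below assumes the summary line looks like the expected output above, with comma-separated key=value pairs after "Configs:". The exact layout can differ between Kafka versions, and topic_configs is a hypothetical helper, not part of any Kafka tool.

```python
import re


def topic_configs(describe_output):
    """Extract the key=value pairs listed after 'Configs:' on the summary line."""
    for line in describe_output.splitlines():
        match = re.search(r"Configs:[ \t]*(\S*)", line)
        if match:
            # Keep only well-formed key=value pairs; an empty field yields {}.
            pairs = [p for p in match.group(1).split(",") if "=" in p]
            return dict(p.split("=", 1) for p in pairs)
    return {}


# Sample text modeled on the expected output shown above (assumed format).
sample = (
    "Topic: example-topic\tPartitionCount: 1\tReplicationFactor: 1\t"
    "Configs: compression.type=snappy\n"
    "\tTopic: example-topic\tPartition: 0\tLeader: 0\tReplicas: 0\tIsr: 0"
)
print(topic_configs(sample))  # {'compression.type': 'snappy'}
```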
Key Concept

If you remember nothing else from this pattern, remember: tuning batch size and compression together helps send messages faster and use less network and storage.
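One reason the two settings work together is that Kafka compresses a whole batch at once, so similar messages share redundancy. The sketch below illustrates the effect with Python's standard zlib module as a stand-in, since snappy is not in the standard library. The JSON payloads are made-up sample data, and real ratios depend on your messages and codec.

```python
import json
import zlib

# Hypothetical sample payloads: 200 similar JSON events.
messages = [
    json.dumps({"user_id": i, "event": "page_view", "path": "/products/list"}).encode()
    for i in range(200)
]

raw_bytes = sum(len(m) for m in messages)
# Compressing every message on its own (what a batch of one amounts to).
one_by_one = sum(len(zlib.compress(m)) for m in messages)
# Compressing all messages together, as happens for one producer batch.
as_batch = len(zlib.compress(b"".join(messages)))

print(f"raw={raw_bytes} one_by_one={one_by_one} as_batch={as_batch}")
```

With payloads this repetitive, the batched result comes out far smaller than the per-message total, which is why small batches tend to waste much of the benefit of compression.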

Common Mistakes
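Raising batch.size without considering linger.ms
A batch is sent when it fills or when linger.ms expires, whichever comes first. With linger.ms=0, batches go out almost immediately and rarely fill under light load, so the larger batch.size only costs memory. With a large linger.ms, messages wait that long under light load, causing high latency.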
Set linger.ms to a small value like 5 ms to balance latency and batching.
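Using compression.type 'none' or an unsupported value
With 'none', no data size reduction happens, wasting bandwidth and storage. An unsupported value makes the producer fail at startup with a configuration error.
Use a supported compression type such as 'snappy', 'gzip', 'lz4', or 'zstd' for better efficiency.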
Changing producer settings but not verifying broker or topic configs
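If the broker or topic compression.type names a different codec than the producer uses, the broker recompresses each batch, overriding the producer's choice and costing broker CPU.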
Check and align broker and topic compression settings with producer configuration.
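The first two mistakes above can be caught before a producer starts with a small lint step over producer.properties. This is a minimal sketch, not a Kafka tool: parse_properties and check_producer_config are hypothetical helpers, and the fallback values are assumptions (batch.size 16384 and compression.type none are the documented defaults, while linger.ms 0 was the default in older releases and may differ in yours).

```python
SUPPORTED_CODECS = {"none", "gzip", "snappy", "lz4", "zstd"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_producer_config(props):
    """Warn about missing or unsupported compression and unpaired batch.size."""
    warnings = []
    codec = props.get("compression.type", "none")
    if codec not in SUPPORTED_CODECS:
        warnings.append(f"unsupported compression.type: {codec}")
    elif codec == "none":
        warnings.append("compression.type is none: batches are sent uncompressed")
    batch_size = int(props.get("batch.size", 16384))
    linger_ms = int(props.get("linger.ms", 0))  # assumed default, see lead-in
    # Illustrative rule: a raised batch.size with no linger window.
    if batch_size > 16384 and linger_ms == 0:
        warnings.append("batch.size raised but linger.ms is 0: batches may rarely fill")
    return warnings


config = """
batch.size=32768
compression.type=snappy
linger.ms=5
"""
print(check_producer_config(parse_properties(config)))  # []
```

The third mistake needs a live cluster to check, so it is left to the kafka-configs and kafka-topics commands shown earlier.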
Summary
Configure batch.size and compression.type in the producer properties file to control message batching and compression.
Use kafka-console-producer with the config file to send messages efficiently.
Update broker compression settings with kafka-configs to ensure consistent compression.
Verify topic configuration with kafka-topics to confirm settings are applied.