
Compression (gzip, snappy, lz4) in Kafka - Commands & Configuration

Introduction
Compression reduces the size of data sent between Kafka producers and brokers. It saves network bandwidth and disk space, which often raises throughput, at the cost of extra CPU work to compress and decompress.
When you want to reduce network usage between your Kafka producers and brokers to save costs.
When your Kafka topics handle large messages and you want to store them efficiently on disk.
When you want to improve throughput by sending smaller compressed messages over the network.
When you want to balance CPU usage and compression speed by choosing the right compression type.
When you want to ensure compatibility with Kafka clients and brokers by using supported compression codecs.
Config File - producer.properties
bootstrap.servers=localhost:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
compression.type=snappy

This configuration file sets up a Kafka producer.

bootstrap.servers tells the producer where the Kafka broker is.

key.serializer and value.serializer convert data to bytes.

compression.type sets the compression codec to use (here, snappy).
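
Kafka compresses whole record batches rather than individual messages, so batching settings influence how much compression helps. The sketch below writes the same producer.properties from a shell script and adds two batching settings. The file location and the values for linger.ms and batch.size are illustrative assumptions, not recommendations from this guide.

```shell
#!/usr/bin/env bash

# Hypothetical location for the generated file; adjust to your setup.
config_file="${TMPDIR:-/tmp}/producer.properties"

cat > "$config_file" <<'EOF'
bootstrap.servers=localhost:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
compression.type=snappy
# Assumed values: wait up to 20 ms and allow 64 KiB batches so each
# compressed batch can hold more records.
linger.ms=20
batch.size=65536
EOF

# Read the codec back to confirm what the producer will use.
codec=$(grep '^compression\.type=' "$config_file" | cut -d= -f2)
echo "compression.type is set to: $codec"
```

The console producer can typically load such a file through its --producer.config option, which avoids repeating the settings on the command line.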

Commands
Starts a Kafka console producer that sends messages to 'example-topic' using gzip compression to reduce message size.
Terminal
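kafka-console-producer --bootstrap-server localhost:9092 --topic example-topic --compression-codec gzip
Expected Output
No output other than a > prompt; each line you type is sent as one message.
--bootstrap-server - Specifies the Kafka broker address to connect to. It replaces --broker-list, which has been deprecated since Kafka 2.5.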
--topic - Specifies the Kafka topic to send messages to.
--compression-codec - Sets the compression codec for messages sent by this producer.
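
A small guard in a wrapper script can catch a mistyped codec before the producer starts. This is only a sketch: the function name is_supported_codec and the CODEC environment variable are hypothetical, and the accepted list is the set of codecs this guide names plus none, which disables compression.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only for codec names Kafka accepts.
is_supported_codec() {
  case "$1" in
    none|gzip|snappy|lz4|zstd) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed convention: take the codec from CODEC, defaulting to gzip.
codec="${CODEC:-gzip}"

if is_supported_codec "$codec"; then
  echo "codec '$codec' is supported"
else
  echo "codec '$codec' is not supported" >&2
fi
```

Once the check passes, the value can be handed to --compression-codec as in the command above.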
Starts a Kafka console consumer to read all messages from 'example-topic' from the beginning, showing that compressed messages are decompressed automatically.
Terminal
kafka-console-consumer --bootstrap-server localhost:9092 --topic example-topic --from-beginning
Expected Output
Hello Kafka Compressed message example
--bootstrap-server - Specifies the Kafka broker address to connect to.
--topic - Specifies the Kafka topic to read messages from.
--from-beginning - Reads all messages from the start of the topic.
Shows details about 'example-topic', including partition count and replication factor, confirming the topic exists. The Configs field lists topic-level overrides, so a topic-level compression.type would appear there if one were set.
Terminal
kafka-topics --bootstrap-server localhost:9092 --describe --topic example-topic
Expected Output
Topic: example-topic PartitionCount: 1 ReplicationFactor: 1 Configs:
    Partition: 0 Leader: 1 Replicas: 1 Isr: 1
--bootstrap-server - Specifies the Kafka broker address.
--describe - Shows detailed information about the topic.
--topic - Specifies the topic to describe.
Key Concept

If you remember nothing else from this pattern, remember: Kafka producers compress messages before sending, and consumers automatically decompress them, saving bandwidth and storage transparently.

Common Mistakes
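Setting compression.type on the broker or topic and assuming the producer's traffic is compressed.
The broker and topic compression.type defaults to producer, which keeps whatever codec the producer used. If you set a specific codec there instead, the broker recompresses batches before writing them, but the data still crosses the network from the producer uncompressed and the broker pays the CPU cost.
Set compression.type in the producer configuration, or use the --compression-codec flag with the producer command, so batches are compressed before they leave the producer.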
Using an unsupported compression codec such as 'lzma'.
Kafka supports only the gzip, snappy, lz4, and zstd compression codecs (plus none to disable compression); an unsupported value is rejected with a configuration error when the producer starts rather than silently falling back to no compression.
Use only supported codecs: gzip, snappy, lz4, or zstd.
Expecting compression to reduce CPU usage significantly.
Compression saves bandwidth but costs CPU to compress and decompress. Faster codecs such as snappy and lz4 typically compress less, while gzip typically compresses more but runs slower.
Choose the compression codec based on your CPU and network trade-offs.
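
The size side of this trade-off can be seen without a Kafka cluster. The sketch below uses the gzip command-line tool at its fastest and densest levels as a stand-in for "faster but larger" versus "slower but smaller"; it does not exercise Kafka's own codec path, and the sample payload is invented for illustration.

```shell
#!/usr/bin/env bash

# Build a repetitive payload, loosely resembling a batch of similar records.
payload_file=$(mktemp)
for i in $(seq 1 2000); do
  printf '{"user_id": %d, "event": "page_view", "region": "eu-west-1"}\n' "$i"
done > "$payload_file"

# Byte counts: uncompressed, fastest gzip level, densest gzip level.
raw_bytes=$(wc -c < "$payload_file" | tr -d ' ')
fast_bytes=$(gzip -1 -c "$payload_file" | wc -c | tr -d ' ')
best_bytes=$(gzip -9 -c "$payload_file" | wc -c | tr -d ' ')

echo "uncompressed: $raw_bytes bytes"
echo "gzip -1:      $fast_bytes bytes"
echo "gzip -9:      $best_bytes bytes"

rm -f "$payload_file"
```

Prefixing each gzip call with the shell's time keyword gives a rough view of the CPU side. Results on real traffic depend heavily on message content and batch size, so measuring with your own data is the safer guide.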
Summary
Set compression.type in the Kafka producer configuration to enable message compression.
Use kafka-console-producer with --compression-codec to send compressed messages.
Kafka consumers automatically decompress messages, so no special config is needed on the consumer side.
Check topic details with kafka-topics to confirm topic configuration and status.