Delete vs Compact Cleanup Policy in Kafka: Key Differences and Usage
The delete cleanup policy removes messages after a retention time or size limit is exceeded, while the compact policy keeps only the latest message per key, removing older duplicates. Delete is for time- or size-based cleanup; compact is for maintaining the latest state per key.
Quick Comparison
This table summarizes the main differences between the delete and compact cleanup policies in Kafka.
| Aspect | Delete Cleanup Policy | Compact Cleanup Policy |
|---|---|---|
| Purpose | Remove old messages after retention time or size | Keep only the latest message per key, remove duplicates |
| Data Retention | Based on time or size limits | Based on key uniqueness, no time limit |
| Use Case | Log data, event streams with expiry | State stores, changelogs, latest updates |
| Message Removal | Deletes messages older than retention | Deletes older messages with same key |
| Data Integrity | May lose data after retention | Always keeps latest state per key |
| Performance Impact | Simple deletion, less CPU | More CPU for compaction process |
Key Differences
The delete cleanup policy in Kafka removes messages based on a configured retention time or size limit. When messages exceed these limits, Kafka deletes them to free up space. This policy is useful when you want to keep data only for a limited time, such as logs or event streams that become irrelevant after some time.
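The time-based behavior can be sketched as a simplified model in Python. This is an illustration of the policy's logic, not Kafka's implementation: real brokers delete whole log segments once the newest record in a segment falls outside the retention window, rather than filtering individual records.

```python
import time

def apply_delete_policy(log, retention_ms, now_ms=None):
    """Drop records older than the retention window.

    Simplified model of cleanup.policy=delete; real Kafka removes
    entire log segments, not individual records.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return [r for r in log if now_ms - r["timestamp"] <= retention_ms]

log = [
    {"key": "key1", "value": "message1", "timestamp": 0},
    {"key": "key2", "value": "message2", "timestamp": 50_000},
    {"key": "key1", "value": "message3", "timestamp": 90_000},
]

# With retention.ms=60000 and "now" at 100 seconds, the record
# written at timestamp 0 is past retention and gets dropped.
survivors = apply_delete_policy(log, 60_000, now_ms=100_000)
print([r["value"] for r in survivors])  # ['message2', 'message3']
```

Note that delete pays no attention to keys: both remaining records could share a key and both would still be kept until they age out.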
On the other hand, the compact cleanup policy focuses on keeping the latest message for each unique key. Kafka scans the log and removes older messages with the same key, ensuring that only the most recent update per key remains. This is ideal for use cases like maintaining a current state or changelog, where you want to always have the latest information without duplicates.
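The key-based behavior can be sketched the same way. Again this is a simplified model: real compaction runs in the background on closed log segments, and a tombstone record is itself retained for `delete.retention.ms` before disappearing, which the sketch skips.

```python
def apply_compact_policy(log):
    """Keep only the latest record per key.

    Simplified model of cleanup.policy=compact. A record with a
    null value (a tombstone) marks its key for removal.
    """
    latest = {}
    for record in log:  # log is in offset order, oldest first
        if record["value"] is None:
            latest.pop(record["key"], None)  # tombstone deletes the key
        else:
            latest[record["key"]] = record
    return list(latest.values())

log = [
    {"key": "key1", "value": "message1"},
    {"key": "key2", "value": "message2"},
    {"key": "key1", "value": "message3"},
]

# key1 appears twice, so only its latest value survives.
print({r["key"]: r["value"] for r in apply_compact_policy(log)})
# {'key1': 'message3', 'key2': 'message2'}
```

This is why a consumer reading a fully compacted topic from the beginning sees a snapshot of the latest state per key rather than the full history.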
While delete is time- or size-driven, compact is key-driven. Compaction requires more CPU because Kafka must track keys and rewrite segments in the background. Delete is simpler and faster but loses data once retention expires. Compact preserves the latest state per key indefinitely: a key's latest record is only removed if a tombstone (a record with a null value) is written for that key.
Delete Cleanup Policy Example
This example shows how to configure a Kafka topic with the delete cleanup policy and produce messages.
```shell
# Create a topic with delete cleanup and a 1-minute retention window
bin/kafka-topics.sh --create --topic delete-topic \
  --bootstrap-server localhost:9092 \
  --partitions 1 --replication-factor 1 \
  --config cleanup.policy=delete \
  --config retention.ms=60000

# Produce keyed messages (parse.key is needed so "key1:message1"
# is split into a key and a value instead of sent as one value)
bin/kafka-console-producer.sh --topic delete-topic \
  --bootstrap-server localhost:9092 \
  --property parse.key=true --property key.separator=:
key1:message1
key2:message2
key1:message3
```
Compact Cleanup Policy Equivalent
This example shows how to configure a Kafka topic with the compact cleanup policy and produce messages with keys.
```shell
# Create a topic with compact cleanup
bin/kafka-topics.sh --create --topic compact-topic \
  --bootstrap-server localhost:9092 \
  --partitions 1 --replication-factor 1 \
  --config cleanup.policy=compact

# Produce keyed messages; compaction needs keys to deduplicate on
bin/kafka-console-producer.sh --topic compact-topic \
  --bootstrap-server localhost:9092 \
  --property parse.key=true --property key.separator=:
key1:message1
key2:message2
key1:message3
```
When to Use Which
Choose the delete cleanup policy when you want to remove old data after a certain time or size, such as logs or event streams that do not need to be kept forever. It is simple and efficient for time-based retention.
Choose the compact cleanup policy when you need to maintain the latest state per key, like in changelog topics or state stores. It ensures you always have the most recent update for each key, which is critical for stateful stream processing.
In some cases, you can combine both policies by setting `cleanup.policy=compact,delete`, which keeps the latest state per key but also removes records older than the retention period.
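The combined policy can be sketched as compaction followed by a retention check, again as a simplified model rather than Kafka's actual segment-based implementation:

```python
def apply_compact_then_delete(log, retention_ms, now_ms):
    """Simplified model of cleanup.policy=compact,delete: keep the
    latest record per key, then drop even that record once it is
    older than the retention window."""
    latest = {}
    for record in log:  # offset order, oldest first
        latest[record["key"]] = record
    return [r for r in latest.values()
            if now_ms - r["timestamp"] <= retention_ms]

log = [
    {"key": "key1", "value": "message1", "timestamp": 0},
    {"key": "key2", "value": "message2", "timestamp": 50_000},
    {"key": "key1", "value": "message3", "timestamp": 90_000},
]

# At now=120s with 60s retention: key1's latest record (90s) is
# kept, but key2's latest record (50s) has aged out and is dropped.
print({r["key"]: r["value"]
       for r in apply_compact_then_delete(log, 60_000, 120_000)})
# {'key1': 'message3'}
```

This hybrid is useful for state topics where stale keys should eventually disappear even without an explicit tombstone.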