
What Is a Compacted Topic in Kafka and How It Works

A compacted topic in Kafka is a topic whose cleanup policy is log compaction: Kafka retains at least the latest value for each unique key and eventually removes older records with the same key. This saves storage while guaranteeing that the most recent update for every key remains available.

How It Works

Imagine a notebook where you write updates about different friends. Instead of keeping every single note, you erase old notes about a friend and keep only the latest one. A compacted topic works similarly in Kafka. Every message carries a key, and Kafka eventually deletes the older messages for a key so that the newest one remains.

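This process is called log compaction. A background log cleaner periodically scans the topic's partitions and removes older messages that share a key with a newer message, so the latest value for each key is always retained. Compaction runs asynchronously and skips the segment that is currently being written, so a consumer may still see some older values until the cleaner catches up. This is useful when you want to keep a snapshot of data, like user profiles or settings, without storing every change.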
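
The effect of compaction on a single partition can be imitated with a few lines of shell. This is only a sketch of the end result, not of how the broker implements it: the `compact` function is a hypothetical helper, and it assumes records arrive as `key:value` lines, oldest first, with no `:` inside the key.

```shell
# compact: hypothetical helper that imitates the result of log compaction.
# Reads "key:value" lines on stdin (oldest first) and prints only the last
# record seen for each key, keeping the surviving records in offset order.
compact() {
  awk -F: '
    { last[$1] = NR; key[NR] = $1; line[NR] = $0 }   # remember the newest offset per key
    END {
      for (i = 1; i <= NR; i++)
        if (last[key[i]] == i) print line[i]           # keep a record only if nothing newer shares its key
    }'
}

printf 'user1:Alice\nuser2:Bob\nuser1:Alicia\nuser3:Charlie\n' | compact
# prints:
# user2:Bob
# user1:Alicia
# user3:Charlie
```

Note that the surviving records keep their original relative order: `user1:Alicia` comes after `user2:Bob` because it was written later.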

Example

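This example creates a compacted topic with Kafka's command-line tools, produces keyed messages, and reads them back. Because compaction runs in the background, a consumer started right after producing still sees all four messages; the output below shows the topic once the cleaner has removed the superseded user1 message.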
bash
# Create a topic with log compaction enabled
bin/kafka-topics.sh --create --topic user-updates --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1 --config cleanup.policy=compact

# Produce keyed messages; the key and value are separated by ":"
bin/kafka-console-producer.sh --topic user-updates --bootstrap-server localhost:9092 --property parse.key=true --property key.separator=:
user1:Alice
user2:Bob
user1:Alicia
user3:Charlie

# Read the topic from the beginning, printing each key before its value
bin/kafka-console-consumer.sh --topic user-updates --bootstrap-server localhost:9092 --from-beginning --property print.key=true --timeout-ms 5000
Output (after the log cleaner has compacted the data)
user2	Bob
user1	Alicia
user3	Charlie
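
Whether the consumer prints three or four records depends on whether the log cleaner has already run. As a rough sketch for a local test setup, the topic-level settings below make compaction start sooner. The values are illustrative assumptions, not recommendations, and the commands assume a Kafka version whose `kafka-configs.sh` accepts `--bootstrap-server` for topic configs.

```shell
# Illustrative values only (assumed for a local test broker):
#   segment.ms                roll the active segment sooner; the cleaner skips the active segment
#   min.cleanable.dirty.ratio lower the share of uncompacted data needed before cleaning starts
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name user-updates \
  --add-config segment.ms=10000,min.cleanable.dirty.ratio=0.01

# Check the resulting topic configuration
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --describe --entity-type topics --entity-name user-updates
```

Even with these settings, a segment is typically rolled only when a new record arrives after `segment.ms` has elapsed, so it may be necessary to produce one more message and wait before the older user1 record disappears.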

When to Use

Use compacted topics when you need to keep the latest state of data identified by keys, such as user profiles, configurations, or inventory levels. They are great for systems that need to recover state quickly or synchronize data without replaying all changes.

For example, a user profile service can use a compacted topic to store the latest profile info per user. If the service restarts, it reads the compacted topic from the beginning to rebuild its current state, replaying far fewer messages than the full history of updates.
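
The recovery step can be sketched in shell as "replay the log and let later values overwrite earlier ones". The `rebuild_state` function is a hypothetical helper, and it assumes `key:value` lines with no `:` inside keys or values. The point of the sketch is that replaying the full log and replaying the compacted log lead to the same final state.

```shell
# rebuild_state: hypothetical helper that replays "key:value" records
# (oldest first) and prints the final value per key, sorted by key,
# the way a restarting service might refill its in-memory cache.
rebuild_state() {
  awk -F: '
    { state[$1] = $2 }                                # a later record overwrites an earlier one
    END { for (k in state) print k "=" state[k] }' | sort
}

# Replaying every update ever written:
printf 'user1:Alice\nuser2:Bob\nuser1:Alicia\nuser3:Charlie\n' | rebuild_state

# Replaying only what survives compaction gives the same state:
printf 'user2:Bob\nuser1:Alicia\nuser3:Charlie\n' | rebuild_state
# both print:
# user1=Alicia
# user2=Bob
# user3=Charlie
```

Because later records overwrite earlier ones, the rebuild stays correct even if the cleaner has not yet removed every superseded record.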

Key Points

  • Compacted topics retain at least the latest message per key; older messages are removed in the background.
  • They save storage by eventually removing superseded messages.
  • Useful for state recovery and data synchronization.
  • Configured by setting cleanup.policy=compact.

Key Takeaways

  • A compacted topic retains at least the latest message for each key and eventually removes older messages with the same key.
  • It is configured by setting the topic's cleanup policy to 'compact'.
  • Compacted topics are ideal for keeping current state data like user profiles or configs.
  • They help systems recover state quickly without replaying the full history of updates.
  • Use compacted topics when you want efficient storage and fast state synchronization.