What is Offset in Kafka: Simple Explanation and Usage
In Kafka, an offset is a number that identifies the position of a message within a partition of a topic; it is unique only within that partition. It acts like a bookmark, letting consumers know which messages they have already read and where to continue reading next.
How It Works
Think of a Kafka topic partition as a long row of mailboxes, each holding a message. The offset is like the mailbox number that tells you exactly where a message is located. When a consumer reads messages, it keeps track of the last offset it processed, so it knows where to pick up next time.
This system helps Kafka manage message delivery efficiently. Because each message has a unique offset within its partition, consumers can read at their own pace without losing their place or rereading messages they have already committed. It's like reading a book with page numbers; you always know where you left off.
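
The mailbox picture above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: `PartitionSketch` and `OffsetModelDemo` are hypothetical names, a plain list stands in for the partition, and nothing here uses or reflects Kafka's real storage or client API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo: a consumer keeps a bookmark into a toy partition. */
class OffsetModelDemo {
    public static void main(String[] args) {
        PartitionSketch partition = new PartitionSketch();
        partition.append("order-1"); // gets offset 0
        partition.append("order-2"); // gets offset 1
        partition.append("order-3"); // gets offset 2

        // The consumer's bookmark: the offset of the next message to read.
        long position = 0;
        for (String message : partition.readFrom(position)) {
            System.out.println("offset " + position + ": " + message);
            position++;
        }

        // A new message arrives later; the consumer resumes at its bookmark
        // and sees only the message it has not read yet.
        partition.append("order-4"); // gets offset 3
        for (String message : partition.readFrom(position)) {
            System.out.println("offset " + position + ": " + message);
            position++;
        }
    }
}

/** Toy model of one partition: an append-only list indexed by offset. */
class PartitionSketch {
    private final List<String> log = new ArrayList<>();

    /** Appends a message and returns the offset assigned to it. */
    long append(String message) {
        log.add(message);
        return log.size() - 1;
    }

    /** Returns a copy of all messages from the given offset to the end. */
    List<String> readFrom(long offset) {
        return new ArrayList<>(log.subList((int) offset, log.size()));
    }

    /** The offset the next appended message will receive. */
    long endOffset() {
        return log.size();
    }
}
```

Offsets are assigned in append order and never change, which is why a single number is enough for a consumer to remember its place.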
Example
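This example shows how to read and commit offsets using Kafka's Java client API. The consumer reads messages and commits the offset after processing each one. Committing after every record keeps the example simple; committing once per batch is more common in practice, because each commitSync call is a blocking round trip to the broker.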
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class OffsetExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        // Disable auto-commit so offsets are committed only after processing
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("Received message: key = %s, value = %s, offset = %d\n",
                            record.key(), record.value(), record.offset());
                    // Process message here

                    // Commit the offset of the next record to read (current offset + 1)
                    TopicPartition partition = new TopicPartition(record.topic(), record.partition());
                    consumer.commitSync(Collections.singletonMap(
                            partition, new OffsetAndMetadata(record.offset() + 1)));
                }
            }
        }
    }
}
When to Use
Offsets are essential whenever you want to track which messages a consumer has processed in Kafka. Use offsets to:
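- Resume reading after a restart from the last committed position instead of starting over.
- Share work across the consumers in a group: offsets are committed per group and partition, so a consumer that takes over a partition continues where the previous one stopped.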
- Replay messages by resetting offsets to an earlier position.
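For example, if you build a system that processes orders from Kafka, committed offsets let the service pick up where it stopped after a restart or when it scales. If offsets are committed only after processing, no order is skipped, but an order handled just before a crash may be delivered again, so the processing should be idempotent.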
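
To make the restart case concrete, here is a minimal simulation of committing after processing. It is a sketch under assumptions, not Kafka client code: `OrderWorker` and `ResumeDemo` are hypothetical names, a list stands in for the partition, and the committed offset and the set of already-handled orders live in memory here, where a real service would keep them in durable storage.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical demo: a worker crashes before a commit, then restarts. */
class ResumeDemo {
    public static void main(String[] args) {
        List<String> partition = List.of("order-1", "order-2", "order-3");
        OrderWorker worker = new OrderWorker();

        // First run: the worker handles offset 1 but "crashes" before committing it.
        worker.run(partition, 1);
        System.out.println("committed after crash: " + worker.committedOffset()); // prints 1

        // Restart: reading resumes at the committed offset, so offset 1 is delivered again.
        worker.run(partition, -1);
        System.out.println("committed after restart: " + worker.committedOffset()); // prints 3
        System.out.println("deliveries: " + worker.deliveries()); // prints 4
        System.out.println("orders handled: " + worker.handledOrders().size()); // prints 3
    }
}

/** Toy consumer that commits an offset only after the record is processed. */
class OrderWorker {
    private long committed = 0;                        // next offset to read after a restart
    private int deliveries = 0;                        // how many records were delivered in total
    private final Set<String> seen = new HashSet<>();  // idempotency guard (assumed durable)
    private final List<String> handled = new ArrayList<>();

    /**
     * Reads from the committed offset to the end of the partition.
     * Simulates a crash right after processing crashAtOffset, before its commit;
     * pass -1 to run without a crash.
     */
    void run(List<String> partition, long crashAtOffset) {
        for (long offset = committed; offset < partition.size(); offset++) {
            String order = partition.get((int) offset);
            deliveries++;
            if (seen.add(order)) {
                handled.add(order); // the side effect happens once per distinct order
            }
            if (offset == crashAtOffset) {
                return; // crash: the commit below never happens for this offset
            }
            committed = offset + 1; // commit the next offset to read
        }
    }

    long committedOffset() { return committed; }

    int deliveries() { return deliveries; }

    List<String> handledOrders() { return handled; }
}
```

The second order is delivered twice but handled once, which is the effect an idempotent consumer aims for when it commits after processing.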
Key Points
- An offset is a unique number for each message within a Kafka partition.
- Consumers use offsets to track their reading position.
- Offsets enable reliable message processing and replay.
- Offsets can be committed automatically or manually by consumers.
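
As a rough illustration of the last point, the choice between automatic and manual commits comes down to consumer configuration. The helper below is a hypothetical sketch (`CommitModeConfig` and `commitSettings` are made-up names); the keys `enable.auto.commit` and `auto.commit.interval.ms` are standard Kafka consumer settings, and the 5000 ms interval is just an example value.

```java
import java.util.Properties;

/** Hypothetical helper that builds only the commit-related consumer settings. */
class CommitModeConfig {
    static Properties commitSettings(boolean autoCommit) {
        Properties props = new Properties();
        props.put("enable.auto.commit", Boolean.toString(autoCommit));
        if (autoCommit) {
            // How often the client commits offsets in the background (example value).
            props.put("auto.commit.interval.ms", "5000");
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println("auto:   " + commitSettings(true));
        System.out.println("manual: " + commitSettings(false));
    }
}
```

With manual mode, the application decides when to commit, for example by calling commitSync after processing.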