What is auto.offset.reset in Kafka: Explanation and Usage
auto.offset.reset in Kafka is a consumer configuration that tells Kafka what to do when there is no initial offset or if the current offset is invalid. It controls whether the consumer starts reading from the earliest message, the latest message, or fails with an error.
How It Works
Imagine you are reading a book, but you lost your bookmark. You need to decide whether to start reading from the beginning or from the latest page. auto.offset.reset works like that bookmark decision for Kafka consumers.
When a Kafka consumer starts, it tries to continue reading from where it left off using stored offsets. If no offset is found or the offset is out of range (maybe the data was deleted), Kafka uses auto.offset.reset to decide where to start.
The common options are earliest (start from the oldest available message), latest (start from new messages only; this is the default), or none (throw an exception if no offset is found).
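The decision described above can be sketched as a small function. This is a simplified model for illustration, not Kafka's actual implementation: the class name OffsetResetSketch, the method resolveStartOffset, and its parameters are hypothetical, and IllegalStateException stands in for the exception a real consumer would raise when the policy is none.

```java
import java.util.OptionalLong;

// Simplified model of the offset reset decision; not Kafka's real code.
class OffsetResetSketch {

    // Returns the offset a consumer would start from for one partition.
    // committed: the stored offset for the consumer group, if any.
    // logStart:  the oldest offset still available in the partition.
    // logEnd:    the offset the next produced message will receive.
    static long resolveStartOffset(OptionalLong committed, long logStart,
                                   long logEnd, String policy) {
        // A stored offset inside the available range is used as-is.
        if (committed.isPresent()) {
            long offset = committed.getAsLong();
            if (offset >= logStart && offset <= logEnd) {
                return offset;
            }
        }
        // No stored offset, or it is out of range: the policy decides.
        switch (policy) {
            case "earliest":
                return logStart;
            case "latest":
                return logEnd;
            case "none":
                // Stand-in exception (assumption for this sketch).
                throw new IllegalStateException(
                        "no valid offset and auto.offset.reset=none");
            default:
                throw new IllegalArgumentException("unknown policy: " + policy);
        }
    }

    public static void main(String[] args) {
        // New consumer group, no stored offset.
        System.out.println(resolveStartOffset(OptionalLong.empty(), 100, 500, "earliest")); // prints 100
        System.out.println(resolveStartOffset(OptionalLong.empty(), 100, 500, "latest"));   // prints 500
        // Valid stored offset: the policy is not consulted.
        System.out.println(resolveStartOffset(OptionalLong.of(250), 100, 500, "latest"));   // prints 250
        // Stored offset points at deleted data: the policy applies again.
        System.out.println(resolveStartOffset(OptionalLong.of(20), 100, 500, "earliest"));  // prints 100
    }
}
```

Note that the policy only matters in the fallback branch. A consumer group with a valid stored offset resumes from that offset regardless of the setting.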
Example
This example shows how to set auto.offset.reset in a Kafka consumer configuration using Java.
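    import java.time.Duration;
    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "my-group");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    // Start from the oldest available message when the group has no valid offset.
    props.put("auto.offset.reset", "earliest");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Arrays.asList("my-topic"));

    // Poll in a loop and print each record.
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
        for (ConsumerRecord<String, String> record : records) {
            System.out.printf("offset = %d, key = %s, value = %s\n", record.offset(), record.key(), record.value());
        }
    }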
When to Use
Use auto.offset.reset=earliest when you want your consumer to read all existing messages from the start if no offset is found. This is useful for new consumers or when you want to reprocess data.
Use auto.offset.reset=latest when you only want to consume new messages arriving after the consumer starts, ignoring old data.
Use auto.offset.reset=none to make sure your application fails if offsets are missing, so you can handle the situation explicitly.
For example, a log processing system might use earliest to process all logs, while a real-time alert system might use latest to only react to new events.
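As a rough illustration of the two use cases above, the sketch below builds consumer properties with a per-application reset policy and rejects values outside the three discussed in this article. The class ConsumerConfigSketch, the helper consumerProps, and the group names are hypothetical. The broker address and deserializers simply mirror the earlier example. It uses only java.util, so it runs without the Kafka client library.

```java
import java.util.Properties;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Kafka client API.
class ConsumerConfigSketch {

    // The three values covered in this article.
    static final Set<String> RESET_POLICIES = Set.of("earliest", "latest", "none");

    // Builds consumer properties for a group with the given reset policy.
    static Properties consumerProps(String groupId, String resetPolicy) {
        if (!RESET_POLICIES.contains(resetPolicy)) {
            throw new IllegalArgumentException(
                    "unsupported auto.offset.reset value: " + resetPolicy);
        }
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", groupId);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", resetPolicy);
        return props;
    }

    public static void main(String[] args) {
        // Log processing: read everything that is still retained.
        Properties logProcessor = consumerProps("log-processor", "earliest");
        // Real-time alerting: react only to events that arrive from now on.
        Properties alerts = consumerProps("realtime-alerts", "latest");

        System.out.println(logProcessor.getProperty("auto.offset.reset")); // prints earliest
        System.out.println(alerts.getProperty("auto.offset.reset"));       // prints latest
    }
}
```

Validating the value up front is a design choice for this sketch: a typo such as "earlist" is caught when the properties are built rather than when the consumer is created.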
Key Points
- auto.offset.reset controls where to start reading if no valid offset exists.
- Common values: earliest, latest, and none.
- Choosing the right value depends on whether you want to process old data or only new data.
- Setting it deliberately helps avoid skipped messages or unexpected reprocessing when offsets are missing or invalid.
Key Takeaways
- auto.offset.reset defines where Kafka consumers start reading if no offset is found.
- Use earliest to read all existing messages from the beginning.
- Use latest to read only new messages arriving after the consumer starts.
- Use none to fail if no offset is found, forcing manual handling.