Kafka · How-To · Beginner · 4 min read

How to Use max.poll.records in Kafka Consumer Configuration

Use the max.poll.records setting in your Kafka consumer configuration to limit the maximum number of records returned by a single poll() call. Add max.poll.records with an integer value to your consumer properties to control the batch size handed to your processing loop.
📐 Syntax

The max.poll.records configuration is set as a key-value pair in the Kafka consumer properties. It defines the maximum number of records returned in one call to poll().

Example parts:

  • max.poll.records: The configuration key.
  • Integer value (e.g., 500): The maximum number of records returned per poll() call. Note that this caps what poll() hands to your code, not how much data the consumer fetches from the broker; network fetch sizes are governed by separate settings such as max.partition.fetch.bytes.
java
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "my-group");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("max.poll.records", "500");
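As a quick sanity check, the sketch below (plain Java, no broker required) stores the setting the same way the consumer properties above do and reads it back as an integer. If you have kafka-clients on the classpath, the key is also available as the constant ConsumerConfig.MAX_POLL_RECORDS_CONFIG, which avoids string literals.

```java
import java.util.Properties;

public class MaxPollRecordsConfigCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        // The value is passed as a string; the consumer parses it as an int internally.
        props.put("max.poll.records", "500");

        int maxPollRecords = Integer.parseInt(props.getProperty("max.poll.records"));
        System.out.println("max.poll.records = " + maxPollRecords);
    }
}
```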
💻 Example

This example shows a Kafka consumer configured with max.poll.records set to 10. It polls messages from a topic and processes up to 10 records per poll call.

java
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class MaxPollRecordsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("max.poll.records", "10");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("my-topic"));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
            System.out.println("Polled " + records.count() + " records");
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("Offset = %d, Key = %s, Value = %s\n", record.offset(), record.key(), record.value());
            }
        }
    }
}
Output
Polled 10 records
Offset = 15, Key = key1, Value = value1
Offset = 16, Key = key2, Value = value2
...
Polled 10 records
Offset = 25, Key = key11, Value = value11
...
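To see what the cap means in practice, the plain-Java sketch below (no broker needed; the backlog size is illustrative) computes how many poll() calls it takes at minimum to drain a backlog when each call returns at most max.poll.records records.

```java
public class PollBatchMath {
    public static void main(String[] args) {
        int backlog = 100;       // messages waiting in the topic (illustrative)
        int maxPollRecords = 10; // cap on records returned per poll()

        // Each poll() returns at most maxPollRecords records, so draining the
        // backlog takes at least ceil(backlog / maxPollRecords) calls.
        int minPolls = (backlog + maxPollRecords - 1) / maxPollRecords;
        System.out.println("Minimum poll() calls to drain backlog: " + minPolls);
    }
}
```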
⚠️ Common Pitfalls

  • Setting max.poll.records too high: Can cause long processing times per poll, risking consumer group rebalances.
  • Setting it too low: May lead to inefficient processing with many small batches.
  • Ignoring max.poll.interval.ms: If processing takes longer than this interval, the consumer may be considered dead and trigger a rebalance.
  • Not committing offsets properly: If you process fewer records than polled or fail to commit, you may reprocess messages.
java
/* Wrong way: max.poll.records too high without adjusting max.poll.interval.ms */
props.put("max.poll.records", "1000");
// If processing takes longer than default max.poll.interval.ms (5 minutes), consumer will be kicked out

/* Right way: balance max.poll.records and max.poll.interval.ms */
props.put("max.poll.records", "100");
props.put("max.poll.interval.ms", "600000"); // 10 minutes
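The balance shown above is simple arithmetic: the worst-case time to process one full batch must stay under max.poll.interval.ms, or the broker considers the consumer dead and triggers a rebalance. A minimal budget check (plain Java; the per-record processing time is an illustrative assumption):

```java
public class PollIntervalBudget {
    public static void main(String[] args) {
        long maxPollIntervalMs = 300_000; // Kafka default: 5 minutes
        long perRecordMs = 400;           // assumed processing time per record (illustrative)

        // Worst-case batch processing time for each max.poll.records setting.
        long slowBatch = 1000 * perRecordMs; // max.poll.records = 1000
        long safeBatch = 100 * perRecordMs;  // max.poll.records = 100

        System.out.println("1000 records: " + slowBatch + " ms, rebalance risk: " + (slowBatch > maxPollIntervalMs));
        System.out.println("100 records: " + safeBatch + " ms, rebalance risk: " + (safeBatch > maxPollIntervalMs));
    }
}
```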
📊 Quick Reference

max.poll.records controls how many records a consumer fetches in one poll call.

  • Default: 500
  • Set lower for faster processing and lower memory use
  • Set higher for throughput but watch processing time
  • Adjust max.poll.interval.ms accordingly
Configuration        | Description                                                    | Default Value
max.poll.records     | Max records returned in one poll()                             | 500
max.poll.interval.ms | Max time between poll() calls before consumer considered dead  | 300000 (5 minutes)

Key Takeaways

  • Set max.poll.records to control the batch size of records returned by poll() in a Kafka consumer.
  • Balance max.poll.records with max.poll.interval.ms to avoid consumer group rebalances.
  • Too high a max.poll.records can cause long processing delays; too low can reduce throughput.
  • Always commit offsets after processing polled records to avoid duplicates.
  • Adjust these settings based on your processing speed and resource availability.