What Is a Consumer Group in Kafka: Explanation and Example
A consumer group in Kafka is a set of consumers that work together to read data from topics. Each message in a topic partition is delivered to only one consumer in the group, enabling parallel processing and load balancing.
How It Works
Imagine a group of friends sharing a box of letters. Each friend takes turns reading different letters so no one reads the same letter twice. In Kafka, a consumer group works similarly: it is a group of consumers that share the work of reading messages from topic partitions.
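Kafka topics are split into partitions, and each partition is assigned to exactly one consumer in the group at a time. The group divides the partitions among its members, so messages are processed in parallel and no two members read the same partition. Because a partition has a single owner, a group can have at most as many active consumers as the topic has partitions; any extra members sit idle. When a consumer joins or leaves, Kafka rebalances the group and redistributes the partitions among the current members, so processing continues.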
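To make the division of partitions concrete, here is a minimal sketch in plain Python that needs no broker. It assumes a simple round-robin split; `assign_partitions` is a hypothetical helper written for this illustration, not a Kafka API, and Kafka's real assignment strategies are configurable and may distribute partitions differently.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over group members round-robin.

    Hypothetical helper for illustration only; it is not how any
    particular Kafka assignor is implemented.
    """
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    if not members:
        return assignment  # an empty group owns nothing
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)  # each partition gets one owner
    return assignment


partitions = [0, 1, 2, 3]

# Three members share four partitions.
three_members = assign_partitions(
    partitions, ["consumer-a", "consumer-b", "consumer-c"]
)

# consumer-c leaves: its partition moves to the remaining members.
two_members = assign_partitions(partitions, ["consumer-a", "consumer-b"])

# More members than partitions: the extra member receives nothing.
five_members = assign_partitions(
    partitions, [f"consumer-{i}" for i in range(5)]
)

print(three_members)
print(two_members)
print(five_members)
```

In every case each partition appears under exactly one member, which is the property that lets the group process a topic in parallel.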
Example
This example shows how to create a Kafka consumer that joins a consumer group and reads messages from a topic.
    from kafka import KafkaConsumer

    # Create a consumer that joins the 'my-group' consumer group
    consumer = KafkaConsumer(
        'my-topic',
        group_id='my-group',
        bootstrap_servers=['localhost:9092'],
        auto_offset_reset='earliest'
    )

    print('Listening for messages...')
    for message in consumer:
        print(f'Received message: {message.value.decode()} from partition {message.partition}')
When to Use
Use consumer groups when you want to scale message processing across multiple consumers. For example, if you have a high volume of data, multiple consumers in a group can share the load and process messages faster.
Consumer groups are also useful for fault tolerance. If one consumer fails, Kafka reassigns its partitions to the other members of the group, which resume from the last committed offsets after a brief rebalance.
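The takeover can be pictured with a small in-memory model. This is a rough sketch in plain Python, not Kafka code: `run_consumer`, the `commit_every` interval, and the `crash_after` switch are all hypothetical names invented for this illustration. It assumes the group tracks a committed offset per partition and that a new owner starts from that offset.

```python
def run_consumer(name, log, committed, commit_every=2, crash_after=None):
    """Process one partition's log starting at the committed offset.

    Returns (processed messages, committed offset afterwards).
    Hypothetical model for illustration; not a Kafka API.
    """
    processed = []
    position = committed
    uncommitted = 0
    while position < len(log):
        if crash_after is not None and len(processed) == crash_after:
            # Simulated crash: progress since the last commit is lost.
            return processed, committed
        processed.append((name, log[position]))
        position += 1
        uncommitted += 1
        if uncommitted == commit_every:
            committed = position  # periodic offset commit
            uncommitted = 0
    return processed, position  # clean finish: commit the final position


log = ["m0", "m1", "m2", "m3", "m4"]  # messages in a single partition

# consumer-a handles three messages, then fails before committing the third.
processed_a, committed = run_consumer("consumer-a", log, 0, crash_after=3)

# consumer-b takes over the partition from the last committed offset.
processed_b, committed = run_consumer("consumer-b", log, committed)

print(processed_a)
print(processed_b)
```

In this model `m2` is handled by both consumers, because the first one failed after processing it but before committing. No message is skipped after a failover, but a message may be seen more than once, so handlers are usually written to tolerate repeats.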
Common use cases include real-time data processing, log aggregation, and event-driven applications where parallel processing and reliability are important.
Key Points
- A consumer group allows multiple consumers to share the work of reading messages.
- Each partition’s messages go to only one consumer in the group.
- Consumer groups enable parallel processing and load balancing.
- If a consumer leaves, Kafka redistributes partitions to others.
- They improve scalability and fault tolerance in Kafka applications.