Subscribing to topics in Kafka - Deep Dive

Overview - Subscribing to topics
What is it?
Subscribing to topics in Kafka means telling a consumer to listen for messages from one or more named channels called topics. A topic is like a category or feed where messages are published. When a consumer subscribes, it receives new messages as they arrive, allowing applications to react to data in real time.
Why it matters
Without subscribing to topics, consumers would not know where to get data from Kafka. This would make it impossible to build systems that react instantly to events, like updating a dashboard or processing orders. Subscribing organizes how data flows from producers to consumers, enabling scalable and reliable communication.
Where it fits
Before learning to subscribe, you should understand Kafka topics and how producers send messages. After mastering subscription, you can learn about consumer groups, offset management, and message processing patterns to build robust streaming applications.
Mental Model
Core Idea
Subscribing to topics is like tuning a radio to specific stations to receive broadcasts you want to hear.
Think of it like...
Imagine a radio listener choosing which stations to listen to. Each station plays different music or news. By tuning in, you get only the content you want, just like a Kafka consumer subscribes to topics to get relevant messages.
┌─────────────┐       ┌───────────────┐       ┌───────────────┐
│  Producer   │──────▶│    Topic A    │──────▶│  Consumer 1   │
└─────────────┘       └───────────────┘       └───────────────┘
                             │
                             │
                             ▼
                      ┌───────────────┐       ┌───────────────┐
                      │    Topic B    │──────▶│  Consumer 2   │
                      └───────────────┘       └───────────────┘
Build-Up - 7 Steps
1
Foundation: Understanding Kafka Topics
Concept: Learn what a Kafka topic is and how it organizes messages.
A Kafka topic is a named stream of messages. Producers send messages to topics, and consumers read from them. Topics are divided into partitions for scalability and parallelism.
Result
You know that topics are the channels where messages live in Kafka.
Understanding topics is essential because subscribing means choosing which of these channels to listen to.
2
Foundation: What Is Subscribing to a Topic
Concept: Learn the basic idea of subscribing as a way for consumers to receive messages.
Subscribing means a consumer tells Kafka which topics it wants to receive messages from. Kafka then delivers new messages from those topics to the consumer.
Result
You understand that subscribing connects consumers to topics to get data.
Knowing that subscription is the link between topics and consumers helps you see how data flows in Kafka.
3
Intermediate: Using Kafka Consumer API to Subscribe
Before reading on: do you think subscribing requires specifying partitions or just topic names? Commit to your answer.
Concept: Learn how to use Kafka's consumer API to subscribe to topics by name.
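In Kafka's Java consumer API, you call consumer.subscribe(Collections.singletonList("topic-name")) to subscribe to a topic by name; no partitions are specified. The call only registers the subscription: the consumer joins its group and receives an automatic partition assignment on the following poll() calls, which then return messages from that topic.
Result
The consumer starts receiving messages from the partitions of the subscribed topic that are assigned to it.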
Understanding that subscription by topic name lets Kafka handle partition assignment simplifies consumer code and scaling.
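Before subscribe() can do anything useful, the consumer needs a handful of settings. The following is a minimal sketch using only java.util.Properties; the broker address "localhost:9092" and the group name "demo-group" are placeholders chosen for illustration, and the KafkaConsumer calls appear only as comments because they need the kafka-clients library on the classpath.

```java
import java.util.Properties;

final class ConsumerConfigSketch {

    // Builds the basic settings a group-managed consumer needs before subscribe().
    // The broker address and group name passed in are placeholders, not values from this guide.
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // subscribe() relies on group management, so group.id must be set.
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("localhost:9092", "demo-group");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
        // With kafka-clients available, the next steps would be roughly:
        //   KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        //   consumer.subscribe(Collections.singletonList("topic-name"));
        //   ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
    }
}
```

Only topic names appear in the subscribe() call; which partitions this consumer reads is decided by the group once polling starts.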
4
Intermediate: Subscribing to Multiple Topics
Before reading on: can a consumer subscribe to multiple topics at once? Commit to yes or no.
Concept: Learn that consumers can subscribe to many topics simultaneously.
You can pass a list of topic names, as in consumer.subscribe(Arrays.asList("topic1", "topic2")), to listen to multiple topics. Each call to subscribe() replaces the previous subscription rather than adding to it, so pass the complete list every time.
Result
Each poll() returns messages from all subscribed topics together; every record carries its topic name, so the consumer can tell them apart.
Knowing that multiple subscriptions are possible allows flexible data consumption from various sources.
5
Intermediate: Using Pattern-Based Subscription
Before reading on: do you think you can subscribe to topics using name patterns (like wildcards)? Commit to yes or no.
Concept: Learn that Kafka supports subscribing to topics matching a regular expression pattern.
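Instead of listing topics, you can use consumer.subscribe(Pattern.compile("topic-prefix.*")) to subscribe to all topics whose names match the regular expression. This is useful when topics are created dynamically. The consumer re-checks the pattern against the cluster's topic list on each metadata refresh (controlled by metadata.max.age.ms, 5 minutes by default).
Result
The consumer automatically receives messages from any topic matching the pattern, including topics created later, once a metadata refresh has picked them up.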
Pattern subscription enables dynamic and scalable consumption without manual topic list updates.
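It can help to check a pattern against sample topic names before handing it to subscribe(). The sketch below uses only java.util.regex and assumes the whole topic name must match the expression; the topic names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class PatternSubscriptionSketch {

    // Returns the topic names a pattern subscription would select,
    // assuming the whole topic name must match the regular expression.
    static List<String> matchingTopics(Pattern pattern, List<String> topics) {
        return topics.stream()
                .filter(t -> pattern.matcher(t).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical topic names, for illustration only.
        List<String> topics = List.of(
                "topic-prefix-orders", "topic-prefix", "topic-prefixed-audit", "other-topic");

        // Loose pattern: also picks up "topic-prefix" and "topic-prefixed-audit".
        System.out.println(matchingTopics(Pattern.compile("topic-prefix.*"), topics));

        // Tighter pattern: requires the "-" separator and at least one more character.
        System.out.println(matchingTopics(Pattern.compile("topic-prefix-.+"), topics));
    }
}
```

The loose pattern selects three of the four names while the tighter one selects only "topic-prefix-orders", which shows how a small change in the expression changes what the consumer reads.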
6
Advanced: Partition Assignment and Rebalancing
Before reading on: do you think Kafka assigns partitions to consumers manually or automatically? Commit to your answer.
Concept: Understand how Kafka assigns topic partitions to consumers in a group and rebalances on changes.
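When consumers with the same group.id subscribe, Kafka assigns each partition to exactly one consumer in the group, so each message is delivered to only one member. If consumers join or leave, or the set of subscribed topics or partitions changes, Kafka rebalances the partitions among the current members.
Result
Each partition has a single owner within the group at any time. Messages can still be processed twice if a partition moves to another consumer before its latest offsets were committed.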
Knowing partition assignment and rebalancing explains how Kafka achieves fault tolerance and scalability.
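The effect of a rebalance can be pictured with a toy model. The sketch below deals partitions out to group members in turn and recomputes the result after one member leaves; it is a simplified stand-in written for this guide, not Kafka's actual assignor code, and the consumer names c1 to c3 are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RebalanceSketch {

    // Deals partitions 0..partitionCount-1 out to the consumers in turn, so every
    // partition has exactly one owner. The consumer list must not be empty.
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        for (String c : consumers) {
            assignment.put(c, new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Three hypothetical group members sharing a six-partition topic.
        System.out.println(assign(List.of("c1", "c2", "c3"), 6));
        // prints {c1=[0, 3], c2=[1, 4], c3=[2, 5]}

        // c3 leaves: the assignment is recomputed and the survivors take over its partitions.
        System.out.println(assign(List.of("c1", "c2"), 6));
        // prints {c1=[0, 2, 4], c2=[1, 3, 5]}
    }
}
```

In this naive model almost every partition changes owner when c3 leaves, which is the kind of churn that makes committing offsets before a rebalance important.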
7
Expert: Handling Subscription Changes and Offset Management
Before reading on: do you think changing subscriptions resets message reading positions automatically? Commit to yes or no.
Concept: Learn how changing subscriptions affects consumer offsets and how to manage them safely.
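When a consumer changes its subscription, the group rebalances and partitions may move. For each partition it is assigned, a consumer resumes from the group's last committed offset; only when no committed offset exists does it fall back to auto.offset.reset (latest by default). Committing offsets before partitions are revoked lets consumers resume where they left off, avoiding message loss or duplication.
Result
With offsets committed at the right points, consumers keep processing consistently even when subscriptions change dynamically.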
Understanding offset behavior during subscription changes prevents common bugs in production streaming systems.
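The rule for where a consumer starts reading a newly assigned partition can be written down as a small decision function. This is a simplified model for illustration: the method name and parameters are invented here, and it ignores edge cases such as a committed offset that has fallen outside the retained log.

```java
import java.util.OptionalLong;

final class StartingOffsetSketch {

    // Picks the position a consumer starts from for one partition.
    // A committed offset for the group wins; the reset policy is only a fallback.
    static long startingOffset(OptionalLong committed, String autoOffsetReset,
                               long logStartOffset, long logEndOffset) {
        if (committed.isPresent()) {
            return committed.getAsLong();
        }
        switch (autoOffsetReset) {
            case "earliest":
                return logStartOffset;
            case "latest":
                return logEndOffset;
            default:
                // Models auto.offset.reset=none, where the consumer raises an error instead.
                throw new IllegalStateException(
                        "no committed offset and reset policy is " + autoOffsetReset);
        }
    }

    public static void main(String[] args) {
        // Partition holding offsets 0..99, so the next offset to be written is 100.
        System.out.println(startingOffset(OptionalLong.of(42), "latest", 0, 100));   // prints 42
        System.out.println(startingOffset(OptionalLong.empty(), "latest", 0, 100));  // prints 100
        System.out.println(startingOffset(OptionalLong.empty(), "earliest", 0, 100)); // prints 0
    }
}
```

The first call shows why a subscription change does not by itself reset positions: as long as a committed offset exists, the reset policy is never consulted.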
Under the Hood
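Kafka consumers maintain a session with the cluster through heartbeats to a group coordinator, which is one of the brokers. When subscribing, the consumer joins the group and reports the topics it wants; with the classic group protocol, a pattern is resolved to concrete topic names on the client first. Partitions are then divided among the consumers in the same group to balance load. Each consumer fetches messages from its assigned partitions, tracks its position locally, and commits offsets to Kafka (the internal __consumer_offsets topic). Rebalancing occurs when group membership or the subscribed topics change, triggering partition reassignment, after which each new owner resumes from the committed offsets.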
Why designed this way?
Kafka was designed for high throughput and fault tolerance. Automatic partition assignment and rebalancing simplify scaling and recovery. Using topic names or patterns for subscription allows flexible consumption without manual partition management. Offset tracking gives at-most-once or at-least-once processing depending on whether offsets are committed before or after processing; exactly-once additionally requires transactions or idempotent processing.
┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│  Consumer 1   │──────▶│ Partition 0   │       │   Topic A     │
│  Consumer 2   │──────▶│ Partition 1   │──────▶│               │
│  Consumer 3   │──────▶│ Partition 2   │       │               │
└───────────────┘       └───────────────┘       └───────────────┘
          ▲                     ▲                       ▲
          │                     │                       │
          └─────────Coordinator ────────────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does subscribing to a topic mean you get all past messages automatically? Commit to yes or no.
Common Belief: Subscribing to a topic means the consumer receives every message ever sent to that topic.
Reality: Consumers start from the group's committed offset for each partition. If none exists, auto.offset.reset decides: latest (the default) skips existing messages, earliest starts from the oldest retained message. A consumer can also seek to a specific position. Earlier messages are not sent automatically.
Why it matters: Assuming all past messages arrive can cause missed data or duplicate processing if offsets are mismanaged.
Quick: Can multiple consumers in the same group read the same partition simultaneously? Commit to yes or no.
Common Belief: Multiple consumers in the same group can read the same partition at the same time for faster processing.
Reality: Kafka assigns each partition to only one consumer in a group to avoid duplicate processing. Multiple consumers read different partitions, so consumers beyond the partition count sit idle.
Why it matters: Misunderstanding this leads to incorrect assumptions about parallelism and can cause data duplication or processing errors.
Quick: Does subscribing with a pattern subscribe only to existing topics or also future ones? Commit to your answer.
Common Belief: Pattern subscription only applies to topics that exist at the time of subscription.
Reality: Pattern subscription also includes matching topics created later; the consumer picks them up on its next metadata refresh.
Why it matters: Not knowing this can cause unexpected message consumption or missed data when topics are dynamically created.
Quick: Does changing the subscription list during runtime cause message loss? Commit to yes or no.
Common Belief: Changing subscriptions on the fly is safe and does not affect message processing.
Reality: Changing subscriptions triggers rebalancing, which can pause consumption temporarily and requires careful offset management to avoid message loss or duplication.
Why it matters: Ignoring this can cause production outages or inconsistent data processing.
Expert Zone
1
Partition assignment strategies (range, round-robin, sticky) affect load balancing and consumer stability in subtle ways.
2
Offset commits can be automatic or manual; choosing the right mode impacts processing guarantees and performance.
3
Subscription patterns can cause unexpected topic matches if naming conventions are not carefully controlled.
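To make the first point concrete, the sketch below counts how many partitions each consumer ends up with under a range-style split (each topic divided into contiguous chunks) and a round-robin-style split (all partitions dealt out one by one). Both methods are simplified models written for this guide, assuming every consumer subscribes to every topic and all topics have the same partition count; they are not Kafka's assignor implementations.

```java
import java.util.Arrays;

final class AssignorComparisonSketch {

    // Range-style: each topic is split on its own, and the first consumers
    // receive the leftover partitions of every topic.
    static int[] rangeCounts(int consumers, int topics, int partitionsPerTopic) {
        int[] counts = new int[consumers];
        int base = partitionsPerTopic / consumers;
        int extra = partitionsPerTopic % consumers;
        for (int t = 0; t < topics; t++) {
            for (int c = 0; c < consumers; c++) {
                counts[c] += base + (c < extra ? 1 : 0);
            }
        }
        return counts;
    }

    // Round-robin-style: partitions of all topics are dealt out one at a time.
    static int[] roundRobinCounts(int consumers, int topics, int partitionsPerTopic) {
        int[] counts = new int[consumers];
        int total = topics * partitionsPerTopic;
        for (int i = 0; i < total; i++) {
            counts[i % consumers]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two consumers, two topics, three partitions per topic.
        System.out.println(Arrays.toString(rangeCounts(2, 2, 3)));      // prints [4, 2]
        System.out.println(Arrays.toString(roundRobinCounts(2, 2, 3))); // prints [3, 3]
    }
}
```

With an odd partition count per topic, the range-style split keeps giving the leftover partition to the same consumer, so the imbalance grows with the number of topics.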
When NOT to use
Topic subscription with subscribe() is not ideal when you need fine-grained control over partitions or want to implement custom load balancing. In such cases, manual partition assignment with consumer.assign(), which bypasses group rebalancing, or the Kafka Streams API might be better alternatives.
Production Patterns
In production, consumers often use consumer groups with pattern subscriptions to handle dynamic topic creation. Offsets are usually committed after processing for at-least-once delivery, with transactions or idempotent handlers added where exactly-once results are required. Rebalance listeners (ConsumerRebalanceListener) are implemented to handle partition changes gracefully without downtime.
Connections
Publish-Subscribe Messaging Pattern
Subscribing to Kafka topics is a concrete implementation of the publish-subscribe pattern.
Understanding Kafka subscription deepens comprehension of how pub-sub systems decouple message producers and consumers for scalable communication.
Load Balancing in Distributed Systems
Kafka's partition assignment during subscription is a form of load balancing among consumers.
Knowing Kafka's subscription mechanism helps grasp general load balancing strategies in distributed computing.
Radio Broadcasting
Subscribing to topics is like tuning into radio stations to receive broadcasts.
This cross-domain connection clarifies how selective listening enables efficient information flow.
Common Pitfalls
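#1 Assuming subscribing to a topic automatically reads all past messages.
Wrong approach: consumer.subscribe(Collections.singletonList("my-topic")); // then expecting all old messages without setting offset
Correct approach: For a group with no committed offsets, set auto.offset.reset=earliest before creating the consumer. To rewind an existing group, call consumer.seekToBeginning(partitions) inside ConsumerRebalanceListener.onPartitionsAssigned(partitions); calling it right after subscribe() does nothing, because consumer.assignment() is empty until poll() has joined the group.
Root cause: Misunderstanding that subscription alone does not control offset position.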
#2 Multiple consumers in the same group reading the same partition, causing duplicate processing.
Wrong approach: Manually assigning the same partition to multiple consumers with consumer.assign() while they share one group.id.
Correct approach: Use consumer.subscribe() and let Kafka assign partitions uniquely per consumer in the group.
Root cause: Manual assignment bypasses the group's partition assignment, so nothing prevents two consumers from reading the same partition.
#3 Changing subscription topics without handling rebalancing, causing message loss.
Wrong approach: consumer.subscribe(newTopics); // without committing offsets or handling rebalance events
Correct approach: Commit offsets before changing the subscription and pass a ConsumerRebalanceListener to subscribe() to manage partition changes safely.
Root cause: Not accounting for Kafka's rebalancing process triggered by subscription changes.
Key Takeaways
Subscribing to Kafka topics connects consumers to the streams of messages they want to process.
Consumers subscribe by topic names or patterns, and Kafka manages partition assignment automatically.
Partition assignment ensures each partition is read by only one consumer in a group, enabling parallelism without duplication.
Offset management during subscription and rebalancing is critical to avoid message loss or duplication.
Understanding subscription internals helps build scalable, fault-tolerant streaming applications with Kafka.