
First message (produce and consume) in Kafka - Deep Dive

Overview - First message (produce and consume)
What is it?
Kafka is a system that lets programs send and receive messages quickly and reliably. Producing means sending a message to Kafka, and consuming means reading that message from Kafka. This process helps different parts of a system talk to each other without being directly connected. It works like a message post office that stores and forwards messages.
Why it matters
Without producing and consuming messages, programs would have to wait for each other or be tightly linked, making systems slow and fragile. Kafka solves this by allowing messages to be sent and received independently, so systems can work faster and keep running even if parts fail. This makes apps more reliable and scalable, which is important for real-time data and big systems.
Where it fits
Before learning this, you should understand basic messaging concepts and how data flows between systems. After this, you can learn about Kafka topics, partitions, offsets, and how to handle message failures and scaling in production.
Mental Model
Core Idea
Producing sends messages into Kafka, and consuming reads them out, enabling independent communication between systems.
Think of it like...
It's like sending a letter to a mailbox (produce) and someone later picking it up and reading it (consume), without needing to be there at the same time.
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│ Producer    │─────▶│ Kafka Topic │─────▶│ Consumer    │
└─────────────┘      └─────────────┘      └─────────────┘
Build-Up - 6 Steps
1
Foundation: What is Kafka messaging
Concept: Kafka is a system that stores and forwards messages between programs.
Kafka acts like a post office for messages. Producers send messages to Kafka, which stores them in topics. Consumers then read messages from these topics when they want.
Result
You understand Kafka as a middleman that holds messages until consumers read them.
Understanding Kafka as a message broker helps you see why producing and consuming are separate steps.
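The post-office idea can be sketched as a toy in-memory broker. This is plain Python, not a Kafka client: a "topic" is just a list the broker keeps, so produce and consume can happen at different times.

```python
# Toy model of the produce/consume decoupling described above.
# Not a Kafka client: a "topic" is just a list the broker stores.

class ToyBroker:
    def __init__(self):
        self.topics = {}  # topic name -> list of stored messages

    def produce(self, topic, message):
        # Producer hands the message to the broker and moves on.
        self.topics.setdefault(topic, []).append(message)

    def consume(self, topic, offset):
        # Consumer reads the message at a given position, whenever it wants.
        log = self.topics.get(topic, [])
        return log[offset] if offset < len(log) else None

broker = ToyBroker()
broker.produce("test", "hello")       # send now...
# ...any amount of time later...
print(broker.consume("test", 0))      # ...read later -> hello
```

The producer never waits for a consumer; the broker holds the message until someone asks for it.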
2
Foundation: Basic produce and consume commands
Concept: You can send and receive messages using simple Kafka commands.
To produce a message:
kafka-console-producer --bootstrap-server localhost:9092 --topic test
Type your message and press Enter. (Older Kafka versions use --broker-list instead of --bootstrap-server.) To consume messages:
kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning
This shows all messages from the start.
Result
You can send a message to Kafka and read it back from the topic.
Knowing these commands lets you try producing and consuming messages hands-on.
3
Intermediate: Message flow and offsets
🤔 Before reading on: do you think consumers read messages in order or randomly? Commit to your answer.
Concept: Messages in Kafka topics have positions called offsets that track order and reading progress.
Each message in a Kafka topic has an offset number starting at 0. Consumers read messages in order by offset. Kafka remembers which offset each consumer has read, so it can continue from there next time.
Result
You see that Kafka keeps track of message order and consumer progress using offsets.
Understanding offsets explains how Kafka tracks each consumer's progress so reading can resume in order without skipping messages.
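The offset mechanism can be sketched as a toy model (plain Python, no Kafka): a list holds messages at positions 0, 1, 2, and a dict remembers how far each consumer group has read so the next poll resumes there.

```python
# Toy model of offsets: each message has a position starting at 0, and the
# "broker" remembers how far each consumer group has read.

log = ["m0", "m1", "m2"]          # messages at offsets 0, 1, 2
committed = {"group-a": 0}        # next offset each group should read

def poll(group, max_messages=10):
    start = committed.get(group, 0)
    batch = log[start:start + max_messages]
    committed[group] = start + len(batch)   # commit progress
    return batch

print(poll("group-a", 2))   # ['m0', 'm1']
print(poll("group-a", 2))   # ['m2'] -- resumes at offset 2, no repeats
```

Each call picks up exactly where the previous one left off, which is the behavior real Kafka consumers get from committed offsets.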
4
Intermediate: Producer and consumer configuration basics
🤔 Before reading on: do you think producers must wait for consumers to be ready before sending messages? Commit to your answer.
Concept: Producers and consumers have settings that control how messages are sent and received, like delivery guarantees and reading behavior.
Producers can set options like 'acks' to control how many Kafka servers confirm a message before it's considered sent. Consumers can set 'auto.offset.reset' to decide what to do if no offset is found (start at earliest or latest message). These settings affect reliability and message flow.
Result
You learn how to adjust producer and consumer behavior for your needs.
Knowing configuration options helps you balance speed, reliability, and message delivery guarantees.
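The 'acks' and 'auto.offset.reset' settings above can be written as plain config dicts in the dotted-key style most Kafka clients accept. This is only a sketch of the keys and values; the client library call that would consume these dicts is omitted, and 'demo-group' and the broker address are placeholder values.

```python
# Producer and consumer settings mentioned above, as plain config dicts.
# Only the keys and values matter here; no broker connection is made.

producer_config = {
    "bootstrap.servers": "localhost:9092",  # placeholder broker address
    "acks": "all",            # wait until all in-sync replicas confirm
}

consumer_config = {
    "bootstrap.servers": "localhost:9092",
    "group.id": "demo-group",               # placeholder group name
    "auto.offset.reset": "earliest",        # no saved offset? start at the first message
}
```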
5
Advanced: Handling the first message in real apps
🤔 Before reading on: do you think consuming always starts from the latest message or from the beginning? Commit to your answer.
Concept: In real systems, consuming the first message requires setting the right offset and handling cases where no messages exist yet.
When a consumer starts, it may find no offset saved. Setting 'auto.offset.reset=earliest' makes it read from the first message. If the topic is empty, the consumer waits for new messages. This ensures the first message is not missed even if the consumer starts late.
Result
You understand how to make sure your consumer reads the very first message produced.
Knowing how to handle offsets and empty topics prevents missing important initial messages.
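The decision 'auto.offset.reset' makes can be sketched as a toy function (plain Python, no Kafka involved): a saved offset always wins; with no saved offset, 'earliest' starts at 0 and 'latest' starts past the end of the log, seeing only future messages.

```python
# Toy model of what auto.offset.reset decides: where a consumer starts
# when it has NO committed offset for a partition.

def starting_offset(log, committed_offset, reset_policy):
    if committed_offset is not None:
        return committed_offset      # saved progress always wins
    if reset_policy == "earliest":
        return 0                     # read from the very first message
    return len(log)                  # "latest": only future messages

log = ["first", "second", "third"]
print(starting_offset(log, None, "earliest"))  # 0 -> sees "first"
print(starting_offset(log, None, "latest"))    # 3 -> waits for new messages
print(starting_offset(log, 2, "latest"))       # 2 -> resumes saved progress
```

This is why a late-starting consumer needs 'earliest' to catch the first message: with 'latest' it would start after everything already in the topic.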
6
Expert: Internal message storage and delivery guarantees
🤔 Before reading on: do you think Kafka deletes messages immediately after consumption? Commit to your answer.
Concept: Kafka stores messages durably and delivers them based on configured guarantees, independent of consumption state.
Kafka writes messages to disk in partitions and keeps them for a configured time or size limit. Messages are not deleted when consumed; they remain until retention expires. This allows multiple consumers to read independently and supports replaying messages.
Result
You see Kafka as a durable log, not just a queue that deletes messages after reading.
Understanding Kafka's storage model explains why consumers can read messages at different times without loss.
Under the Hood
Kafka stores messages in partitions on disk as an ordered log. Producers append messages to the end. Consumers track offsets to know which message to read next. Kafka uses a commit log design, which means messages are immutable and stored sequentially for fast reads and writes. This design supports high throughput and fault tolerance.
Why designed this way?
Kafka was designed to handle large volumes of data with low latency and high reliability. Using a commit log allows simple, fast writes and easy recovery after failures. Keeping messages for a retention period enables multiple consumers to read independently and replay messages if needed. Alternatives like traditional queues delete messages on consumption, limiting flexibility.
┌───────────────┐
│ Producer      │
└──────┬────────┘
       │ Append message
       ▼
┌──────────────────────┐
│ Kafka Partition Log  │
│ (Ordered, Immutable) │
└──────┬───────────────┘
       │ Read by offset
       ▼
┌───────────────┐
│ Consumer      │
└───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do you think Kafka deletes messages immediately after a consumer reads them? Commit yes or no.
Common Belief: Kafka removes messages as soon as a consumer reads them to save space.
Reality: Kafka keeps messages for a configured retention time or size, regardless of consumption.
Why it matters: Assuming messages disappear after reading can cause data loss if consumers restart or need to replay messages.
Quick: Do you think a producer must wait for a consumer to be ready before sending messages? Commit yes or no.
Common Belief: Producers can only send messages if consumers are actively listening.
Reality: Producers send messages to Kafka independently; consumers can read them later at their own pace.
Why it matters: Believing this limits system design and prevents using Kafka as a decoupling layer.
Quick: Do you think all consumers share the same reading position in Kafka? Commit yes or no.
Common Belief: All consumers read messages from the same offset, so they see the same messages at the same time.
Reality: Each consumer group tracks its own offset, allowing independent reading progress.
Why it matters: Misunderstanding this can cause confusion about message delivery and duplicate processing.
Quick: Do you think Kafka guarantees message order across all topics and partitions? Commit yes or no.
Common Belief: Kafka always delivers messages in the exact order they were produced across the entire topic.
Reality: Kafka guarantees order only within a single partition, not across multiple partitions.
Why it matters: Expecting global order can lead to bugs when processing messages from multiple partitions.
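The per-partition ordering rule can be sketched as a toy partitioner (plain Python; real Kafka uses a murmur2 hash of the key, not the byte sum used here): messages with the same key land in the same partition and therefore stay in produce order relative to each other, while messages in different partitions have no global order.

```python
# Toy model of key-based partitioning: same key -> same partition -> ordered.
# The byte-sum "hash" is a stand-in, not Kafka's actual partitioner.

NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

def produce(key, message):
    p = sum(key.encode()) % NUM_PARTITIONS   # toy stand-in for the key hash
    partitions[p].append(message)

for i in range(4):
    produce("order-42", f"order-42 event {i}")   # same key -> same partition
produce("order-7", "order-7 event 0")            # different key, may land elsewhere

# All events for key "order-42" sit in one partition, in produce order:
p42 = sum("order-42".encode()) % NUM_PARTITIONS
print(partitions[p42])
```

If you need all events for one entity processed in order, give them the same key; ordering across different keys is not guaranteed.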
Expert Zone
1
Producers can batch messages to improve throughput but must balance latency and memory use.
2
Consumers can commit offsets manually or automatically, affecting message processing guarantees.
3
Kafka's idempotent producer feature prevents duplicate messages during retries, important for exactly-once delivery.
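The idempotence idea in point 3 can be sketched as a toy dedup check (plain Python; real Kafka's protocol also involves producer epochs and per-batch sequencing): the broker remembers the last sequence number accepted per producer id and drops retried duplicates.

```python
# Toy model of the idempotent producer: the broker tracks the highest
# sequence number seen per producer id and ignores retried duplicates.

log = []
last_seq = {}   # producer_id -> highest sequence number accepted

def append(producer_id, seq, message):
    if last_seq.get(producer_id, -1) >= seq:
        return False                 # duplicate retry: silently dropped
    last_seq[producer_id] = seq
    log.append(message)
    return True

append("p1", 0, "hello")
append("p1", 0, "hello")   # network retry of the same send -> dropped
append("p1", 1, "world")
print(log)                 # ['hello', 'world']
```

This is why a producer can safely retry after a timeout without writing the message twice.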
When NOT to use
Kafka is not ideal for low-latency request-response patterns or small-scale messaging. Alternatives like RabbitMQ or MQTT may be better for simple queueing or IoT scenarios.
Production Patterns
In production, teams use consumer groups to scale reading, configure retention policies for storage, and monitor offsets to detect lag. They also handle failures with retries and dead-letter queues.
Connections
Message Queueing
Kafka is a type of message queue but uses a log-based design instead of traditional queue semantics.
Understanding Kafka as a durable log helps differentiate it from simple queues and explains its replay and scaling abilities.
Event Sourcing
Kafka's log storage aligns with event sourcing, where all changes are stored as a sequence of events.
Knowing Kafka supports event sourcing clarifies how systems can rebuild state from message history.
Library Book Lending
Like Kafka messages, library books are kept for a time and can be borrowed independently by many readers.
This shows how Kafka allows multiple consumers to read the same data independently without removing it.
Common Pitfalls
#1 Starting a consumer without an offset reset policy misses existing messages.
Wrong approach: kafka-console-consumer --bootstrap-server localhost:9092 --topic test
Correct approach: kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning
Root cause: The console consumer starts at the latest offset by default, so it skips messages produced before it started.
#2 Assuming messages are deleted after consumption leads to data loss.
Wrong approach: Relying on Kafka to remove messages immediately after a consumer reads them.
Correct approach: Configure retention policies explicitly and manage consumer offsets properly.
Root cause: Misunderstanding Kafka's retention model causes incorrect assumptions about message availability.
#3 Producing messages without waiting for acknowledgments risks message loss.
Wrong approach: kafka-console-producer --bootstrap-server localhost:9092 --topic test (relying on weak or default acknowledgment settings)
Correct approach: Configure the producer with 'acks=all' (for the console producer, pass --producer-property acks=all) so a message counts as sent only once all in-sync replicas have it.
Root cause: Weak acks settings trade delivery guarantees for speed.
Key Takeaways
Producing and consuming in Kafka are separate steps that let systems communicate asynchronously.
Kafka stores messages durably in an ordered log, allowing consumers to read at their own pace.
Offsets track consumer progress and ensure messages are read in order within partitions.
Proper configuration of producers and consumers is key to reliable message delivery.
Understanding Kafka's storage and delivery model prevents common mistakes and enables powerful real-world use.