How Kafka Stores Messages: Storage Mechanism Explained
Apache Kafka stores messages in topics, which are divided into partitions. Each partition is an ordered, immutable sequence of messages stored as log segment files on disk, which provides durability and fast sequential access.
Syntax
Kafka stores messages in topics. Each topic is split into partitions. Partitions are stored as ordered logs on disk. Messages are appended sequentially to these logs. Each message has an offset, a unique number identifying its position in the partition.
- Topic: Logical channel for messages.
- Partition: Subdivision of a topic, stored as a log.
- Log segment: File on disk holding a chunk of a partition's messages.
- Offset: Position of a message in a partition.
kafka
Topic -> Partition -> Log Segments -> Messages with Offsets
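The hierarchy above can be sketched as a small in-memory model. This is only an illustration, not Kafka's implementation: the names `Segment`, `PartitionLog`, and `max_messages_per_segment` are hypothetical, and the sketch rolls to a new segment after a fixed message count, whereas a real broker rolls segments based on size and time settings.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    base_offset: int  # offset of the first message held by this segment
    messages: list = field(default_factory=list)


class PartitionLog:
    """Toy append-only log for one partition (hypothetical model)."""

    def __init__(self, max_messages_per_segment=100):
        self.max_messages_per_segment = max_messages_per_segment
        self.segments = [Segment(base_offset=0)]
        self.next_offset = 0

    def append(self, message):
        active = self.segments[-1]
        # Roll to a new segment once the active one is full.
        if len(active.messages) >= self.max_messages_per_segment:
            active = Segment(base_offset=self.next_offset)
            self.segments.append(active)
        active.messages.append(message)
        offset = self.next_offset
        self.next_offset += 1
        return offset


# A topic is modeled as a mapping from partition number to its own log.
orders = {0: PartitionLog(), 1: PartitionLog()}

for i in range(100):
    orders[0].append(f"Order{i}")  # fills offsets 0-99 in the first segment

offset = orders[0].append("Order123")
print(offset, len(orders[0].segments), orders[0].segments[-1].base_offset)
# prints: 100 2 100
```

Offsets are assigned per partition, so partition 1 in this sketch still starts at offset 0 regardless of how many messages partition 0 holds.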
Example
This example shows how Kafka stores messages in a topic with partitions and log segments. Each message is appended to the partition log with a unique offset.
text
Topic: orders
Partitions: 2

Partition 0 log segments:
  Segment 1: messages with offsets 0-99
  Segment 2 (active): starts at offset 100, no messages yet

Partition 1 log segments:
  Segment 1: messages with offsets 0-99

Message append example:
  Producer sends message "Order123" to topic "orders", partition 0.
  Message stored at offset 100 in partition 0, log segment 2.
Output
Message "Order123" stored at offset 100 in partition 0 of topic "orders".
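To show how an offset maps back to a segment, here is a minimal sketch that finds the segment holding a given offset by binary search over the segments' base offsets. The helper names `find_segment` and `segment_file_name` are hypothetical. The file naming follows the convention Kafka uses for segment files on disk, where the name is the segment's base offset zero-padded to 20 digits.

```python
from bisect import bisect_right


def segment_file_name(base_offset):
    # Segment files are named after the first offset they contain.
    return f"{base_offset:020d}.log"


def find_segment(base_offsets, offset):
    """Return the base offset of the segment that would hold `offset`.

    `base_offsets` must be sorted in ascending order.
    """
    idx = bisect_right(base_offsets, offset) - 1
    if idx < 0:
        raise ValueError(f"offset {offset} is before the first segment")
    return base_offsets[idx]


base_offsets = [0, 100]  # partition 0 from the example: two segments
base = find_segment(base_offsets, 100)
print(segment_file_name(base))
# prints: 00000000000000000100.log
```

Because base offsets are sorted, the lookup stays logarithmic in the number of segments, which is one reason splitting a partition into segments does not make reads by offset expensive.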
Common Pitfalls
Common mistakes when understanding Kafka message storage include:
- Confusing topics with partitions. Messages are stored in partitions, not directly in topics.
- Assuming messages are deleted immediately after consumption. Kafka retains messages based on retention policies, not consumer reads.
- Ignoring that partitions are immutable logs; messages are only appended, never updated.
Correct understanding helps in designing scalable and reliable Kafka systems.
text
Wrong: Assuming messages are deleted after consumer reads
Right: Messages remain until retention time expires

Wrong: Treating topic as a single log
Right: Topic is split into multiple partitions, each a separate log
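The first pitfall can be illustrated with a toy model in which reading only advances a consumer's own position and never touches the log. The `Consumer` class and its `poll` method here are hypothetical stand-ins written for this sketch, not the Kafka client API.

```python
# One partition's messages; the list index plays the role of the offset.
log = [f"Order{i}" for i in range(5)]


class Consumer:
    """Hypothetical consumer that tracks its own read position."""

    def __init__(self):
        self.position = 0  # next offset this consumer will read

    def poll(self, log, max_records=10):
        records = log[self.position:self.position + max_records]
        self.position += len(records)  # only the position changes
        return records


billing = Consumer()
shipping = Consumer()

billing.poll(log)                   # reads all five messages
shipping.poll(log, max_records=2)   # reads only the first two

print(len(log), billing.position, shipping.position)
# prints: 5 5 2
```

The log still holds all five messages after both reads, so a slower or newly started consumer can read the same data later, as long as retention has not removed it.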
Quick Reference
| Concept | Description |
|---|---|
| Topic | Logical channel for messages, divided into partitions |
| Partition | Ordered, immutable sequence of messages stored as logs |
| Log Segment | Chunk of messages stored as files on disk within a partition |
| Offset | Unique position number of a message in a partition |
| Retention | Time or size limit for how long messages are stored |
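
As a rough sketch of the Retention row, the function below drops the oldest segments of a partition while they are older than a time limit or while the partition exceeds a size limit. The names `SegmentInfo` and `apply_retention` are hypothetical, and the rules are simplified: the sketch removes whole segments and never removes the newest one, which approximates but does not reproduce the broker's actual cleanup logic.

```python
from dataclasses import dataclass


@dataclass
class SegmentInfo:
    base_offset: int
    size_bytes: int
    last_timestamp_ms: int  # timestamp of the newest message in the segment


def apply_retention(segments, now_ms, retention_ms=None, retention_bytes=None):
    """Return the segments that survive retention (oldest first)."""
    kept = list(segments)
    while len(kept) > 1:  # assumption: the newest segment is always kept
        oldest = kept[0]
        too_old = (retention_ms is not None
                   and now_ms - oldest.last_timestamp_ms > retention_ms)
        too_big = (retention_bytes is not None
                   and sum(s.size_bytes for s in kept) > retention_bytes)
        if not (too_old or too_big):
            break
        kept.pop(0)  # delete the whole oldest segment
    return kept


segments = [
    SegmentInfo(base_offset=0, size_bytes=400, last_timestamp_ms=1_000),
    SegmentInfo(base_offset=100, size_bytes=400, last_timestamp_ms=5_000),
    SegmentInfo(base_offset=200, size_bytes=100, last_timestamp_ms=9_000),
]

by_time = apply_retention(segments, now_ms=10_000, retention_ms=6_000)
by_size = apply_retention(segments, now_ms=10_000, retention_bytes=450)
print([s.base_offset for s in by_time], [s.base_offset for s in by_size])
# prints: [100, 200] [200]
```

Note that the function takes no consumer state at all: in this model, as in the pitfalls above, what consumers have read plays no part in deciding what is deleted.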
Key Takeaways
Kafka stores messages in partitions, which are ordered logs on disk.
Each message in a partition has a unique offset for identification.
Messages are appended and never modified, and they are not deleted when consumed.
Retention policies control how long messages stay stored.
Understanding partitions and offsets is key to using Kafka effectively.