
How Kafka Stores Messages: Storage Mechanism Explained

Apache Kafka stores messages in topics, which are divided into partitions. Each partition is an ordered, append-only sequence of messages stored on disk as a series of log segment files. Persisting to disk keeps messages durable, and sequential appends keep reads and writes fast.
📐

Syntax

Kafka stores messages in topics. Each topic is split into partitions, and each partition is stored as an ordered log on disk. Messages are appended sequentially to the end of that log. Each message has an offset: a sequential number, unique within its partition, that identifies the message's position.

  • Topic: Logical channel for messages.
  • Partition: Subdivision of a topic, stored as a log.
  • Log segment: A file on disk holding a contiguous chunk of a partition's messages.
  • Offset: Position of a message in a partition.
kafka
Topic -> Partition -> Log Segments -> Messages with Offsets
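The hierarchy above can be sketched as a small in-memory model. This is only an illustration, not Kafka's implementation: the class names, the `max_messages_per_segment` threshold, and the message values are hypothetical, and real Kafka rolls segments by size or time rather than by message count.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    base_offset: int                                   # offset of the first message in this segment
    messages: List[str] = field(default_factory=list)


@dataclass
class PartitionLog:
    max_messages_per_segment: int = 3                  # toy roll threshold (hypothetical)
    segments: List[Segment] = field(default_factory=list)
    next_offset: int = 0

    def append(self, value: str) -> int:
        """Append to the active (last) segment and return the assigned offset."""
        if not self.segments or len(self.segments[-1].messages) >= self.max_messages_per_segment:
            # Roll a new segment; its base offset is the next offset to be assigned.
            self.segments.append(Segment(base_offset=self.next_offset))
        self.segments[-1].messages.append(value)
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def read(self, offset: int) -> str:
        """Look up a message by offset; reading does not remove it."""
        for segment in self.segments:
            index = offset - segment.base_offset
            if 0 <= index < len(segment.messages):
                return segment.messages[index]
        raise KeyError(f"offset {offset} not found")


# A topic is modeled as a list of independent partition logs.
topic = {"orders": [PartitionLog(), PartitionLog()]}
log = topic["orders"][0]
offsets = [log.append(v) for v in ["Order120", "Order121", "Order122", "Order123"]]
print(offsets)                                  # [0, 1, 2, 3]
print([s.base_offset for s in log.segments])    # [0, 3]
```

Offsets are assigned per partition, so the second partition in this model would start again at offset 0.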
💻

Example

This example shows how Kafka stores messages in a topic with partitions and log segments. Each message is appended to the partition log with a unique offset.

text
Topic: orders
Partitions: 2
Partition 0 log segments:
  Segment 1: messages with offsets 0-99
  Segment 2 (active): messages with offsets 100-199
Partition 1 log segments:
  Segment 1 (active): messages with offsets 0-99

Message append example:
Producer sends message "Order123" to topic "orders", partition 0.
Message stored at offset 200 in partition 0 log segment 2.
Output
Message "Order123" stored at offset 200 in partition 0 of topic "orders".
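On disk, each partition gets its own directory (for example `orders-0`), and each segment file is named after the offset of its first message, zero-padded to 20 digits. The sketch below assumes the two base offsets from the example (0 and 100) and shows how a reader could find the segment that holds a given offset. The function names are hypothetical, and the binary search only illustrates the idea; it is not Kafka's actual lookup code.

```python
from bisect import bisect_right
from typing import Sequence


def segment_file_name(base_offset: int) -> str:
    """Build a segment file name: the base offset zero-padded to 20 digits."""
    return f"{base_offset:020d}.log"


def find_segment(base_offsets: Sequence[int], offset: int) -> int:
    """Return the base offset of the segment that would contain `offset`.

    `base_offsets` must be sorted in ascending order.
    """
    position = bisect_right(base_offsets, offset) - 1
    if position < 0:
        raise ValueError(f"offset {offset} is before the first segment")
    return base_offsets[position]


base_offsets = [0, 100]                      # segments of partition 0 in the example
names = [segment_file_name(b) for b in base_offsets]
print(names)
print(find_segment(base_offsets, 42))        # 0   -> first segment
print(find_segment(base_offsets, 150))       # 100 -> second segment
```

Because the base offset is part of the file name, the right segment can be located from the directory listing alone, without opening any file.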
⚠️

Common Pitfalls

Common mistakes when understanding Kafka message storage include:

  • Confusing topics with partitions. Messages are stored in partitions, not directly in topics.
  • Assuming messages are deleted immediately after consumption. Kafka retains messages based on retention policies, not consumer reads.
  • Ignoring that partitions are immutable logs; messages are only appended, never updated.

Correct understanding helps in designing scalable and reliable Kafka systems.

text
Wrong: Assuming messages are deleted after consumer reads
Right: Messages remain until the retention time or size limit is reached

Wrong: Treating topic as a single log
Right: Topic is split into multiple partitions, each a separate log
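The first pitfall can be made concrete with a toy model of time-based retention. This is a simplified sketch: the `ClosedSegment` class, the timestamps, and the 7-hour window are made up for illustration, and in a real cluster the window comes from settings such as `retention.ms`. The point is that the consumer's position is never an input to the decision, and that cleanup drops whole segments rather than individual messages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClosedSegment:
    base_offset: int
    newest_message_ms: int     # timestamp of the newest message in the segment


def apply_time_retention(segments: List[ClosedSegment], now_ms: int,
                         retention_ms: int) -> List[ClosedSegment]:
    """Keep only segments whose newest message is still inside the retention window."""
    return [s for s in segments if now_ms - s.newest_message_ms <= retention_ms]


HOUR_MS = 60 * 60 * 1000
segments = [
    ClosedSegment(base_offset=0, newest_message_ms=0),
    ClosedSegment(base_offset=100, newest_message_ms=5 * HOUR_MS),
]

consumer_position = 200        # the consumer has already read everything; this changes nothing

remaining = apply_time_retention(segments, now_ms=8 * HOUR_MS, retention_ms=7 * HOUR_MS)
print([s.base_offset for s in remaining])    # [100]
```

The first segment is removed because its newest message is 8 hours old, while the second survives even though it has been fully consumed.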
📊

Quick Reference

Concept | Description
Topic | Logical channel for messages, divided into partitions
Partition | Ordered, immutable sequence of messages stored as logs
Log Segment | Chunk of messages stored as files on disk within a partition
Offset | Unique position number of a message in a partition
Retention | Time or size limit for how long messages are stored

Key Takeaways

  • Kafka stores messages in partitions, which are ordered logs on disk.
  • Each message in a partition has a unique offset that identifies it.
  • Messages are appended and never modified, and they are not deleted when consumed.
  • Retention policies control how long messages stay stored.
  • Understanding partitions and offsets is key to using Kafka effectively.