What is Replication in Kafka: Explanation and Example
In Kafka, replication means copying the data in each partition from one broker to others, so that several brokers hold the same data. If a broker fails, the remaining copies protect against data loss and let other brokers continue serving the data.
How It Works
Replication in Kafka works like making backup copies of important files on different computers. Imagine you have a photo saved on your laptop, and you copy it to your phone and a USB drive. If your laptop breaks, you still have the photo on your phone or USB. Kafka does the same with messages by copying them to multiple brokers.
Each piece of data in Kafka is stored in a topic partition, and each partition has several replicas on different brokers. One replica acts as the leader for that partition, and the others act as followers. By default, the leader handles all reads and writes, while followers copy the data from the leader. If the leader fails, a follower that is in sync with it takes over, so the partition stays available as long as a caught-up replica survives.
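The leader and follower roles described above can be sketched with a small toy model. This is only an illustration under simplifying assumptions, not Kafka's implementation or API: the `Replica` and `Partition` classes and their methods are hypothetical names, replication happens in one explicit step, and a new leader is chosen simply as the first live follower holding a full copy of the log.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One broker's copy of a partition log (hypothetical toy type)."""
    broker_id: int
    log: list = field(default_factory=list)
    alive: bool = True


class Partition:
    """Toy model of one partition: one leader, the rest followers."""

    def __init__(self, broker_ids):
        self.replicas = [Replica(b) for b in broker_ids]
        self.leader = self.replicas[0]

    def produce(self, message):
        # Writes go to the leader only.
        self.leader.log.append(message)

    def replicate(self):
        # Each live follower copies whatever it is missing from the leader.
        for r in self.replicas:
            if r is not self.leader and r.alive:
                r.log.extend(self.leader.log[len(r.log):])

    def fail_leader(self):
        # Mark the leader dead, then promote the first live follower
        # that holds a full copy of the old leader's log.
        old = self.leader
        old.alive = False
        for r in self.replicas:
            if r.alive and r.log == old.log:
                self.leader = r
                return r.broker_id
        raise RuntimeError("no caught-up follower available")


partition = Partition([1, 2, 3])
partition.produce("order-1")
partition.produce("order-2")
partition.replicate()
new_leader = partition.fail_leader()
print(new_leader, partition.leader.log)  # 2 ['order-1', 'order-2']
```

In this sketch broker 1 starts as leader, brokers 2 and 3 copy its two messages, and broker 2 takes over when broker 1 fails. If `replicate()` were skipped, no follower would be caught up and the model would raise an error instead of promoting one.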
Example
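This command creates a topic named my-topic with one partition and three replicas. It needs a cluster with at least three brokers, because each replica of a partition is placed on a different broker.
kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 3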
When to Use
Use replication in Kafka when you want to keep your data safe and available even if some servers fail. It is essential for systems that need high reliability, like financial transactions, logging, or real-time analytics.
For example, if you run an online store, replication ensures that order data is not lost if a server crashes. It also helps maintain smooth service without interruptions.
Key Points
- Replication copies data across multiple Kafka brokers.
- It prevents data loss and improves fault tolerance.
- Each partition has one leader replica that handles client reads and writes by default; followers replicate its data.
- If the leader fails, an in-sync follower becomes the new leader automatically.
- Replication factor defines how many copies of each partition exist.
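As a rough rule of thumb, the replication factor also bounds how many brokers can fail before the last copy of a partition is gone. The sketch below shows that arithmetic. The function name `tolerated_broker_failures` is hypothetical, and the calculation assumes every follower is fully caught up when the failures happen.

```python
def tolerated_broker_failures(replication_factor: int) -> int:
    """Brokers that can fail while at least one copy of a partition survives.

    Assumes all replicas are fully caught up at the time of failure.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


for rf in (1, 2, 3):
    print(rf, tolerated_broker_failures(rf))
```

With the replication factor of 3 used in the example command, two brokers could fail and one copy of the partition would still exist.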