
How to Back Up Kafka Data: Simple Steps and Examples

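To back up Kafka data, export topic messages with kafka-console-consumer and export consumer offsets separately. On older, ZooKeeper-based deployments, offsets can be exported with kafka-run-class kafka.tools.ExportZkOffsets; that tool was removed in Kafka 2.0, and newer clusters keep offsets in the internal __consumer_offsets topic, which kafka-consumer-groups --describe can report. kafka-dump-log prints the contents of log segment files if you need to inspect them. You can also copy the Kafka log directories directly for a full backup of the messages stored on disk.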

Syntax

Backing up Kafka data involves exporting topic messages or copying log files. Here are common commands:

  • kafka-console-consumer --bootstrap-server <broker> --topic <topic> --from-beginning --timeout-ms <time> > backup-file.txt: Export topic messages to a file.
  • kafka-run-class kafka.tools.ExportZkOffsets --zkconnect <zookeeper> --output-file <file>: Export consumer offsets stored in ZooKeeper (legacy consumers only; the tool was removed in Kafka 2.0).
  • cp -r /var/lib/kafka/data /backup/location: Copy the broker's log directory (the path set in log.dirs) to back up the raw data files.

What each option does:

  • --bootstrap-server: Kafka broker address (host:port).
  • --topic: Name of the topic to back up.
  • --from-beginning: Read all messages from the earliest available offset.
  • --timeout-ms: Exit if no message arrives within this many milliseconds.
  • --zkconnect: ZooKeeper connection string.
  • --output-file: File to save offsets to.
bash
kafka-console-consumer --bootstrap-server localhost:9092 --topic my-topic --from-beginning --timeout-ms 10000 > backup-my-topic.txt

kafka-run-class kafka.tools.ExportZkOffsets --zkconnect localhost:2181 --output-file offsets-backup.txt

cp -r /var/lib/kafka/data /backup/kafka-data-backup
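The export command above can be wrapped in a small helper so that each run writes to its own timestamped file. This is a minimal sketch, not an official tool: export_topic, KAFKA_CONSUMER_CMD, and BACKUP_DIR are illustrative names, and the default backup path is an assumption. It also passes --property print.key=true so that message keys are saved alongside values; by default the console consumer prints values only.

```shell
#!/bin/sh
# Sketch: export one topic to a timestamped file, keeping message keys.
# KAFKA_CONSUMER_CMD, BACKUP_DIR and export_topic are illustrative names,
# not Kafka settings.
KAFKA_CONSUMER_CMD="${KAFKA_CONSUMER_CMD:-kafka-console-consumer}"
BACKUP_DIR="${BACKUP_DIR:-/backup/kafka-topics}"

export_topic() {
    broker="$1"; topic="$2"; timeout_ms="${3:-10000}"
    command -v "$KAFKA_CONSUMER_CMD" >/dev/null 2>&1 || {
        echo "consumer command not found: $KAFKA_CONSUMER_CMD" >&2
        return 1
    }
    mkdir -p "$BACKUP_DIR" || return 1
    out="$BACKUP_DIR/$topic-$(date +%Y%m%dT%H%M%S).txt"
    # print.key=true writes "key<TAB>value" per line instead of the value alone.
    # The exit status is not checked here, since how the tool exits after
    # --timeout-ms may differ between Kafka versions.
    "$KAFKA_CONSUMER_CMD" \
        --bootstrap-server "$broker" \
        --topic "$topic" \
        --from-beginning \
        --timeout-ms "$timeout_ms" \
        --property print.key=true > "$out"
    # Print the path of the file that was written.
    echo "$out"
}
```

A call such as export_topic localhost:9092 my-topic 10000 prints the path of the file it wrote. Text export like this suits line-oriented string messages; binary payloads or values containing newlines would need a different approach.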

Example

This example exports all messages from a Kafka topic named orders to a text file, then exports consumer offsets from ZooKeeper (for legacy, ZooKeeper-based consumers).

bash
kafka-console-consumer --bootstrap-server localhost:9092 --topic orders --from-beginning --timeout-ms 5000 > orders-backup.txt

kafka-run-class kafka.tools.ExportZkOffsets --zkconnect localhost:2181 --output-file consumer-offsets.txt
Output
Both commands write to files rather than to the terminal: orders-backup.txt holds the messages from topic orders, one per line, and consumer-offsets.txt holds the exported consumer offsets.
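On clusters where consumer groups commit offsets to Kafka rather than ZooKeeper, the offsets can be captured with kafka-consumer-groups instead. The following is a hedged sketch: save_group_offsets and KAFKA_GROUPS_CMD are illustrative names, and the script may be installed as kafka-consumer-groups.sh depending on the distribution.

```shell
#!/bin/sh
# Sketch: save a consumer group's current offsets as reported by the broker.
# KAFKA_GROUPS_CMD and save_group_offsets are illustrative names.
KAFKA_GROUPS_CMD="${KAFKA_GROUPS_CMD:-kafka-consumer-groups}"

save_group_offsets() {
    broker="$1"; group="$2"; out="$3"
    # --describe prints one row per partition, including CURRENT-OFFSET and LAG.
    "$KAFKA_GROUPS_CMD" --bootstrap-server "$broker" \
        --describe --group "$group" > "$out" || return 1
    # An empty file means nothing was captured; treat that as a failure.
    [ -s "$out" ]
}
```

For example, save_group_offsets localhost:9092 my-group offsets-my-group.txt (my-group being a placeholder group name) leaves a human-readable table that records where each partition's consumer had reached at backup time.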

Common Pitfalls

Common mistakes when backing up Kafka data include:

  • Not specifying --from-beginning with kafka-console-consumer, which results in backing up only new messages.
  • Forgetting to export consumer offsets, causing issues when restoring consumers.
  • Copying Kafka log files while Kafka is running, which can cause inconsistent backups.
  • Not having enough disk space for backups.

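For a consistent backup, stop the Kafka broker before copying its log files. If you cannot stop it, at least pause producers and consumers, and keep in mind that the broker may still roll, delete, or compact segments while the copy runs.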
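Once the broker is stopped, the log directory can be archived and checksummed so the copy can be verified later. This is a minimal sketch under assumptions: snapshot_log_dir is a hypothetical helper, the paths are examples, and sha256sum is assumed to be available (GNU coreutils; other systems may provide shasum instead).

```shell
#!/bin/sh
# Sketch: archive a stopped broker's log directory and record a checksum.
# snapshot_log_dir and the paths used with it are illustrative, not part of Kafka.
snapshot_log_dir() {
    src="$1"; dest_dir="$2"
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    archive="$dest_dir/kafka-data-$(date +%Y%m%dT%H%M%S).tar.gz"
    # -C keeps paths inside the archive relative to the log directory's parent.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Store the checksum next to the archive so the copy can be verified later.
    ( cd "$dest_dir" &&
      sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" ) || return 1
    # Print the path of the archive that was written.
    echo "$archive"
}
```

Running snapshot_log_dir /var/lib/kafka/data /backup/kafka-data-backup prints the archive path; sha256sum -c on the .sha256 file, run from the backup directory, later confirms the archive is unchanged.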

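bash
# Wrong: without --from-beginning, only messages produced after the consumer starts are captured
kafka-console-consumer --bootstrap-server localhost:9092 --topic orders > backup.txt

# Right: read from the earliest offset and exit after 5 seconds without new messages
kafka-console-consumer --bootstrap-server localhost:9092 --topic orders --from-beginning --timeout-ms 5000 > backup.txt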
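The disk-space pitfall listed above can be checked before a backup starts. A rough sketch, assuming a POSIX du and df; has_room_for_backup is a hypothetical helper name, and the comparison ignores any savings from compression.

```shell
#!/bin/sh
# Sketch: refuse to start a backup unless the target has room for the source.
# has_room_for_backup is a hypothetical helper name.
has_room_for_backup() {
    src="$1"; dest="$2"
    # Size of the source tree in KiB.
    need_kb=$(du -sk "$src" | awk '{print $1}')
    # Free space on the filesystem holding the destination, in KiB
    # (-P selects the portable one-line-per-filesystem output format).
    free_kb=$(df -Pk "$dest" | awk 'NR==2 {print $4}')
    # Succeed only if both values were read and free space exceeds the need.
    [ -n "$need_kb" ] && [ -n "$free_kb" ] && [ "$free_kb" -gt "$need_kb" ]
}
```

It can guard a copy, for example: has_room_for_backup /var/lib/kafka/data /backup && cp -r /var/lib/kafka/data /backup/kafka-data-backup.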

Quick Reference

Summary tips for backing up Kafka data:

  • Use kafka-console-consumer with --from-beginning to export topic data.
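  • Export consumer offsets as well: with the ExportZkOffsets tool on legacy ZooKeeper-based setups, or with kafka-consumer-groups --describe on newer clusters.
  • Copy Kafka log directories only when the broker is stopped, or at least when producers and consumers are paused.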
  • Store backups in a safe, separate location.
  • Test your backups by restoring data in a test environment.
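As one way to test a text backup, its lines can be replayed into a scratch topic on a test cluster with kafka-console-producer. This is a sketch under assumptions: restore_topic and KAFKA_PRODUCER_CMD are illustrative names, the backup is assumed to hold one message value per line (as produced without print.key), and --bootstrap-server is the flag used by recent releases of the producer tool (older ones use --broker-list).

```shell
#!/bin/sh
# Sketch: replay a text backup (one message value per line) into a topic.
# KAFKA_PRODUCER_CMD and restore_topic are illustrative names.
KAFKA_PRODUCER_CMD="${KAFKA_PRODUCER_CMD:-kafka-console-producer}"

restore_topic() {
    broker="$1"; topic="$2"; file="$3"
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # The console producer sends each line of standard input as one message.
    "$KAFKA_PRODUCER_CMD" --bootstrap-server "$broker" --topic "$topic" < "$file"
}
```

For instance, restore_topic localhost:9092 orders-restore-test orders-backup.txt loads the backup into a topic named orders-restore-test (a placeholder name). Replayed messages get new offsets and timestamps, and partition assignment may differ from the original topic.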

Key Takeaways

Use kafka-console-consumer with --from-beginning to export all topic messages.
Export consumer offsets separately to maintain consumer group state.
Copy Kafka log files only when Kafka is stopped to avoid an inconsistent copy.
Store backups securely and verify them regularly.
Backing up both data and offsets lets consumers resume from where they left off after a restore.