
Cross-datacenter replication in Kafka - Commands & Configuration

Introduction
When you have Kafka clusters in different data centers, you want to copy data between them automatically. Cross-datacenter replication helps keep data in sync across locations, so apps can read and write data locally but still share it globally.
When to Use
When you want to keep Kafka topics synchronized between two data centers for disaster recovery.
When you have users in different regions and want to reduce latency by replicating data closer to them.
When you want to back up Kafka data from one cluster to another in a different physical location.
When you need to migrate data gradually from one Kafka cluster to another without downtime.
When you want to aggregate data from multiple Kafka clusters into a central cluster for analytics.
Config File - replicator.properties
bootstrap.servers=source-kafka1:9092,source-kafka2:9092
replication.factor=3
group.id=replicator-group
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
config.storage.topic=replicator-configs
offset.storage.topic=replicator-offsets
status.storage.topic=replicator-status

# Destination cluster bootstrap servers
producer.bootstrap.servers=dest-kafka1:9092,dest-kafka2:9092

# Topics to replicate
topics=important-topic,logs-topic

# Replication policy
replication.policy.class=org.apache.kafka.connect.mirror.DefaultReplicationPolicy

# Maximum number of parallel replication tasks
tasks.max=1

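This configuration file sets up Kafka MirrorMaker 2, which runs on the Kafka Connect framework, for cross-datacenter replication. The group.id, converter, and *.storage.topic entries are Kafka Connect worker settings; the entries explained below control the replication itself.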

bootstrap.servers: Connects to the source Kafka cluster.

producer.bootstrap.servers: Connects to the destination Kafka cluster.

topics: Lists the topics to replicate.

replication.policy.class: Defines how topic names are mapped between clusters.

tasks.max: Maximum number of parallel replication tasks.
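
For comparison, the connect-mirror-maker driver that ships with Apache Kafka reads a properties layout built around cluster aliases rather than a single bootstrap.servers / producer.bootstrap.servers pair. The sketch below writes the same source-to-destination flow in that layout. The aliases source and dest and the file name mm2.properties are assumptions chosen for illustration; they do not appear in the file above.

```shell
# Write a minimal MirrorMaker 2 config in the cluster-alias layout.
# "source", "dest" and mm2.properties are illustrative names (assumptions).
cat > mm2.properties <<'EOF'
# Aliases for the clusters taking part in replication
clusters=source,dest
source.bootstrap.servers=source-kafka1:9092,source-kafka2:9092
dest.bootstrap.servers=dest-kafka1:9092,dest-kafka2:9092

# Enable the source -> dest flow and choose the topics to copy
source->dest.enabled=true
source->dest.topics=important-topic,logs-topic

# Replication factor for topics created on the destination
replication.factor=3
replication.policy.class=org.apache.kafka.connect.mirror.DefaultReplicationPolicy
tasks.max=1
EOF
```

With DefaultReplicationPolicy in this layout, replicated topics appear on the destination prefixed with the source alias, for example source.important-topic.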

Commands
Starts the Kafka MirrorMaker 2 process using the configuration file to begin replicating topics from the source to the destination cluster.
Terminal
connect-mirror-maker replicator.properties
Expected Output
INFO Starting MirrorMaker 2
INFO Connecting to source cluster at source-kafka1:9092,source-kafka2:9092
INFO Connecting to destination cluster at dest-kafka1:9092,dest-kafka2:9092
INFO Replicating topics: important-topic, logs-topic
INFO MirrorMaker 2 started successfully
Lists all topics on the destination Kafka cluster to verify that the topics have been replicated.
Terminal
kafka-topics --bootstrap-server dest-kafka1:9092 --list
Expected Output
important-topic
logs-topic
replicator-configs
replicator-offsets
replicator-status
--bootstrap-server - Specifies the Kafka cluster to connect to
--list - Lists all topics in the cluster
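
The manual check above can be scripted. The sketch below assumes the listing arrives on standard input with one topic name per line; missing_topics is a hypothetical helper name, not part of the Kafka CLI.

```shell
# Print every expected topic that is absent from a topic listing read on stdin.
# Example (needs a reachable cluster):
#   kafka-topics --bootstrap-server dest-kafka1:9092 --list \
#     | missing_topics important-topic logs-topic
missing_topics() {
  listing=$(cat)
  for topic in "$@"; do
    # -x matches the whole line, -F treats the topic name as a literal string
    printf '%s\n' "$listing" | grep -qxF -- "$topic" || printf '%s\n' "$topic"
  done
}
```

Empty output means every expected topic was found, so the function can gate a deployment step or an alert.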
Shows the status of the replication consumer group on the destination cluster to check replication progress and offsets.
Terminal
kafka-consumer-groups --bootstrap-server dest-kafka1:9092 --describe --group replicator-group
Expected Output
GROUP             TOPIC            PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST        CLIENT-ID
replicator-group  important-topic  0          12345           12345           0    consumer-1   /127.0.0.1  replicator-1
--bootstrap-server - Connects to the Kafka cluster
--describe - Shows detailed information about the consumer group
--group - Specifies the consumer group to describe
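
To watch replication progress over time, the LAG column can be summed into a single number. This is a minimal sketch that assumes the whitespace-separated layout shown in the expected output; total_lag is a hypothetical helper name.

```shell
# Sum the LAG column of `kafka-consumer-groups --describe` output read on stdin.
# Example (needs a reachable cluster):
#   kafka-consumer-groups --bootstrap-server dest-kafka1:9092 \
#     --describe --group replicator-group | total_lag
total_lag() {
  awk '
    # Until the header row is seen, look for the LAG column and skip the line.
    lag_col == 0 { for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i; next }
    # Add up numeric lag values; rows showing "-" in that column are ignored.
    $lag_col ~ /^[0-9]+$/ { sum += $lag_col }
    END { print sum + 0 }
  '
}
```

A total of 0 means the group has caught up on every partition that reports a numeric lag.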
Key Concept

If you remember nothing else from this pattern, remember: Kafka MirrorMaker 2 uses a connector configuration to continuously copy topics between clusters, keeping data in sync across data centers.

Common Mistakes
Not specifying the destination cluster bootstrap servers in the configuration.
Without the destination cluster info, MirrorMaker cannot send data to the target cluster, so replication fails.
Always include producer.bootstrap.servers with the destination Kafka brokers in the config file.
Trying to replicate topics that do not exist on the source cluster.
MirrorMaker cannot replicate non-existent topics, so no data will be copied.
Verify the topics exist on the source cluster before adding them to the topics list.
Running MirrorMaker without proper permissions on source or destination clusters.
Lack of permissions causes connection or write failures, stopping replication.
Ensure the user running MirrorMaker has read access on source and write access on destination clusters.
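
The first mistake above can be caught before startup with a small pre-flight check on the configuration file. The sketch below assumes the flat key=value format of replicator.properties; check_required_keys is a hypothetical helper name.

```shell
# Report each required key that is missing (or has an empty value)
# in a key=value properties file.
# Usage: check_required_keys FILE KEY...
check_required_keys() {
  file=$1
  shift
  for key in "$@"; do
    awk -F= -v want="$key" '
      # Trim surrounding whitespace from the key part of the line.
      { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }
      k == want && $2 != "" { found = 1 }
      END { exit !found }
    ' "$file" || printf 'missing: %s\n' "$key"
  done
}
```

For example, check_required_keys replicator.properties bootstrap.servers producer.bootstrap.servers topics prints nothing when all three keys are set.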
Summary
Create a MirrorMaker 2 configuration file specifying source and destination Kafka clusters and topics to replicate.
Start MirrorMaker 2 with the configuration to begin cross-datacenter replication.
Verify replication by listing topics and checking consumer group status on the destination cluster.