Standalone vs distributed mode in Kafka - CLI Comparison

Introduction
Kafka can run in two ways: in standalone mode, a single broker on one machine, suited to testing and learning; and in distributed mode, a cluster of several brokers, suited to production. A single broker is simple to start but offers no redundancy, while a cluster spreads partitions and their replicas across brokers so it can handle more data and keep working when a broker fails.
Standalone mode: when you want to try Kafka quickly on your laptop without setting up multiple servers
Standalone mode: when you need to test your Kafka setup or application code before going live
Distributed mode: when you want to run Kafka in production with high availability and fault tolerance
Distributed mode: when you want to handle large volumes of data across multiple servers
Distributed mode: when you want Kafka to keep working even if some servers fail
Commands
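Starts a single Kafka broker (standalone mode) using the default configuration file. Depending on the Kafka version, the broker also needs either a running ZooKeeper instance or KRaft storage that has been formatted beforehand.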
Terminal
kafka-server-start.sh /usr/local/kafka/config/server.properties
Expected Output (illustrative; exact log wording varies by Kafka version)
[2024-06-01 12:00:00,000] INFO Kafka started (kafka.server.KafkaServer)
[2024-06-01 12:00:00,001] INFO Awaiting socket connections on port 9092 (kafka.network.SocketServer)
Creates a topic named 'test-topic' with one partition and a replication factor of 1 (a single copy of the data, so no redundancy), which is the most a single broker can support.
Terminal
kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
Expected Output
Created topic test-topic.
--partitions - Number of partitions for the topic
--replication-factor - Number of copies of data for fault tolerance
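Starts three Kafka brokers in distributed mode. Each broker has its own configuration file with a unique broker ID, port, and log directory, and the trailing & runs each one in the background so all three can join the same cluster.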
Terminal
kafka-server-start.sh /usr/local/kafka/config/server-1.properties &
kafka-server-start.sh /usr/local/kafka/config/server-2.properties &
kafka-server-start.sh /usr/local/kafka/config/server-3.properties &
Expected Output (illustrative; exact log wording varies by Kafka version)
[2024-06-01 12:01:00,000] INFO Kafka server started on port 9092 (server-1)
[2024-06-01 12:01:00,100] INFO Kafka server started on port 9093 (server-2)
[2024-06-01 12:01:00,200] INFO Kafka server started on port 9094 (server-3)
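The per-broker configuration files are not shown above. As a rough sketch of what they might contain, the script below writes three minimal files into a scratch directory. It assumes a ZooKeeper-based Kafka release (KRaft-based setups use different keys such as node.id); the output directory, the /tmp log paths, and the ZooKeeper address are illustrative choices, not values taken from this tutorial.

```shell
#!/bin/sh
# Sketch: generate three minimal broker config files for a local test cluster.
# Assumption: ZooKeeper-based Kafka; all paths and addresses are illustrative.
set -eu

CONFIG_DIR="${CONFIG_DIR:-./kafka-cluster-config}"   # hypothetical output directory
mkdir -p "$CONFIG_DIR"

for id in 1 2 3; do
  port=$((9091 + id))   # brokers 1, 2, 3 listen on 9092, 9093, 9094
  cat > "$CONFIG_DIR/server-$id.properties" <<EOF
broker.id=$id
listeners=PLAINTEXT://localhost:$port
log.dirs=/tmp/kafka-logs-$id
zookeeper.connect=localhost:2181
EOF
  echo "wrote $CONFIG_DIR/server-$id.properties (broker.id=$id, port=$port)"
done
```

Only the three values that must differ per broker (ID, listener port, log directory) change between files; everything else can stay identical across the cluster.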
Creates a topic named 'distributed-topic' with three partitions and replication factor three for fault tolerance in distributed mode.
Terminal
kafka-topics.sh --create --topic distributed-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 3
Expected Output
Created topic distributed-topic.
--partitions - Number of partitions to spread load
--replication-factor - Number of copies to keep data safe
Shows details about the distributed topic, including partitions and replicas, to verify the distributed setup.
Terminal
kafka-topics.sh --describe --topic distributed-topic --bootstrap-server localhost:9092
Expected Output
Topic: distributed-topic PartitionCount: 3 ReplicationFactor: 3 Configs:
Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Partition: 1 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
Partition: 2 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
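In the describe output, Replicas lists the brokers assigned to hold a partition and Isr lists the replicas currently in sync with the leader. When the Isr list is shorter than the Replicas list, a broker is down or lagging. One way to check this automatically is sketched below; check_isr is a hypothetical helper name, and it assumes the describe output arrives one partition per line, as the tool prints it.

```shell
# Sketch: read `kafka-topics.sh --describe` output on stdin and report any
# partition whose in-sync replica list is shorter than its replica list.
# check_isr is a hypothetical helper, not part of Kafka.
check_isr() {
  awk '
    /Partition:/ && /Replicas:/ && /Isr:/ {
      # Pick out the value that follows each label on the partition line.
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:")  reps = $(i + 1)
        if ($i == "Isr:")       isr  = $(i + 1)
      }
      n_reps = split(reps, a, ",")
      n_isr  = split(isr, b, ",")
      if (n_isr < n_reps) {
        printf "partition %s under-replicated: replicas=%s isr=%s\n", part, reps, isr
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Usage against a running cluster (not executed here):
#   kafka-topics.sh --describe --topic distributed-topic \
#     --bootstrap-server localhost:9092 | check_isr
```

The function exits with status 0 when every partition is fully in sync and 1 otherwise, so it can gate a deployment script. kafka-topics.sh also accepts --under-replicated-partitions together with --describe, which filters the listing on the Kafka side.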
Key Concept

If you remember nothing else from this pattern, remember: standalone mode is for simple testing on one machine, while distributed mode uses multiple servers to handle more data and stay reliable.

Common Mistakes
Trying to use replication in standalone mode by setting replication-factor greater than 1
Standalone mode runs only one Kafka broker, so there is nowhere to place extra copies. The broker rejects the request because the replication factor is larger than the number of available brokers.
Use replication-factor 1 in standalone mode, or switch to distributed mode with multiple brokers if you need replication.
Starting multiple Kafka servers with the same configuration file and port
Two brokers cannot share a port, a broker ID, or a log directory, so the second one fails at startup.
Create a separate configuration file for each Kafka server in distributed mode, with its own port, broker ID, and log directory.
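The first mistake can be caught before Kafka is ever called. The sketch below is a small pre-flight check; check_replication_factor is a hypothetical helper, and the broker count is supplied by the caller rather than discovered from the cluster.

```shell
# Sketch: refuse a replication factor that exceeds the number of brokers.
# check_replication_factor is a hypothetical helper; the caller supplies
# the broker count (1 for standalone mode, 3 for the cluster in this guide).
check_replication_factor() {
  requested=$1
  brokers=$2
  if [ "$requested" -gt "$brokers" ]; then
    echo "replication factor $requested exceeds broker count $brokers" >&2
    return 1
  fi
  return 0
}

# A factor of 3 fits the three-broker cluster used in this guide.
if check_replication_factor 3 3; then
  echo "ok: replication factor 3 fits a 3-broker cluster"
fi
```

Calling check_replication_factor 3 1 returns a non-zero status, which mirrors what a single broker does when asked for three replicas.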
Summary
Start Kafka with a single server for quick testing using the default config file.
Create topics with one partition and a replication factor of 1 in standalone mode.
Run multiple Kafka servers with different configs to form a distributed cluster.
Create topics with multiple partitions and replication for fault tolerance in distributed mode.
Use kafka-topics.sh --describe to verify topic details, including partition leaders, replicas, and in-sync replicas.