Kafka vs Pulsar: Key Differences and When to Use Each
Kafka and Pulsar are both distributed messaging systems, but they differ architecturally: Kafka uses a partitioned log model in which brokers both store and serve data, while Pulsar separates the serving and storage layers with a segment-based architecture built on Apache BookKeeper. Pulsar also supports multi-tenancy and geo-replication natively, making it more flexible for cloud-native and large-scale use cases.
Quick Comparison
This table summarizes the main differences between Kafka and Pulsar across key factors.
| Factor | Kafka | Pulsar |
|---|---|---|
| Architecture | Monolithic broker handles storage and serving | Decoupled serving (brokers) and storage (BookKeeper) layers |
| Messaging Model | Partitioned log with topics and partitions | Partitioned topics whose data is stored as segments (ledgers) in BookKeeper |
| Multi-tenancy | Limited, requires separate clusters | Built-in multi-tenancy with namespaces and policies |
| Geo-replication | Supported via MirrorMaker (external tool) | Native geo-replication with configurable policies |
| Message Retention | Configurable retention by time or size | Configurable retention with tiered storage support |
| Scalability | Scale by adding brokers and partitions | Scale storage and serving independently for better elasticity |
Key Differences
Kafka uses a broker-centric architecture where each broker stores and serves data for assigned partitions. This design is simple and efficient but can limit scalability and flexibility when handling very large workloads or multi-tenant environments.
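The partitioned log model means records with the same key always land on the same partition, preserving per-key ordering. Below is a minimal sketch of keyed partition assignment; note this uses `String.hashCode()` as a stand-in, whereas Kafka's default partitioner actually hashes the serialized key with murmur2.

```java
// Illustrative sketch of keyed partition assignment.
// Assumption: String.hashCode() stands in for Kafka's murmur2 hash.
public class PartitionSketch {
    // Map a record key to a partition index in [0, numPartitions).
    public static int choosePartition(String key, int numPartitions) {
        // Mask off the sign bit so the modulo result is non-negative.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // Records with the same key always map to the same partition.
        System.out.println("key1 -> partition " + choosePartition("key1", 6));
        System.out.println("key2 -> partition " + choosePartition("key2", 6));
    }
}
```

Because assignment is deterministic per key, adding partitions later changes the mapping, which is one reason Kafka topics are usually sized with partition counts chosen up front.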
Pulsar separates the serving layer (brokers) from the storage layer (Apache BookKeeper). Brokers handle message routing and delivery, while BookKeeper manages durable storage in segments. This separation allows Pulsar to scale storage and serving independently, improving elasticity and fault tolerance.
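The storage decoupling is visible in Pulsar's broker configuration, where BookKeeper replication is tuned independently of broker count. The fragment below is illustrative; the setting names come from Pulsar's `broker.conf`, and the values are example choices, not recommendations.

```properties
# How many bookies (BookKeeper storage nodes) each ledger segment is striped across
managedLedgerDefaultEnsembleSize=3
# How many copies of each entry are written
managedLedgerDefaultWriteQuorum=2
# How many bookie acks are required before a write is considered durable
managedLedgerDefaultAckQuorum=2
```

Because brokers hold no partition data themselves, a topic can move to another broker without copying storage, which is what gives Pulsar its faster rebalancing and independent scaling.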
Additionally, Pulsar supports multi-tenancy natively with namespaces and policies, making it easier to manage multiple teams or applications on the same cluster. It also offers built-in geo-replication with configurable policies, whereas Kafka requires external tools like MirrorMaker for cross-data-center replication.
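Multi-tenancy shows up directly in Pulsar's topic naming scheme, `persistent://tenant/namespace/topic`. A minimal sketch of constructing such names (the helper method here is hypothetical, but the name format is Pulsar's real convention):

```java
// Sketch: Pulsar topic names embed tenant and namespace, which is how
// multi-tenancy is expressed at the naming level.
public class TopicNameSketch {
    // Build a fully qualified Pulsar topic name:
    // persistent://tenant/namespace/topic
    public static String fullTopicName(String tenant, String namespace, String topic) {
        return "persistent://" + tenant + "/" + namespace + "/" + topic;
    }

    public static void main(String[] args) {
        // Two teams can use the same short topic name without colliding,
        // because their tenants and namespaces differ.
        System.out.println(fullTopicName("team-a", "orders", "events"));
        System.out.println(fullTopicName("team-b", "orders", "events"));
    }
}
```

Policies such as retention, quotas, and geo-replication are applied at the namespace level, so every topic under `team-a/orders` inherits the same rules.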
Code Comparison
Here is a simple example of producing and consuming messages in Kafka using Java.
```java
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class KafkaExample {
    public static void main(String[] args) {
        String topic = "test-topic";

        // Producer properties
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());

        // Create producer and send a message
        try (Producer<String, String> producer = new KafkaProducer<>(producerProps)) {
            ProducerRecord<String, String> record =
                new ProducerRecord<>(topic, "key1", "Hello Kafka");
            producer.send(record);
            producer.flush();
            System.out.println("Message sent to Kafka");
        }

        // Consumer properties
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "test-group");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());
        consumerProps.put("auto.offset.reset", "earliest");

        // Create consumer and poll messages
        try (Consumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(Collections.singletonList(topic));
            var records = consumer.poll(Duration.ofSeconds(5));
            records.forEach(r -> System.out.println("Received: " + r.value()));
        }
    }
}
```
Pulsar Equivalent
Here is the equivalent example for Pulsar, using the Java client to produce and consume messages.
```java
import org.apache.pulsar.client.api.*;

public class PulsarExample {
    public static void main(String[] args) throws PulsarClientException {
        String serviceUrl = "pulsar://localhost:6650";
        String topic = "persistent://public/default/test-topic";

        try (PulsarClient client = PulsarClient.builder()
                .serviceUrl(serviceUrl)
                .build()) {

            // Producer
            try (Producer<byte[]> producer = client.newProducer()
                    .topic(topic)
                    .create()) {
                producer.send("Hello Pulsar".getBytes());
                System.out.println("Message sent to Pulsar");
            }

            // Consumer
            try (Consumer<byte[]> consumer = client.newConsumer()
                    .topic(topic)
                    .subscriptionName("test-subscription")
                    .subscriptionType(SubscriptionType.Exclusive)
                    .subscribe()) {
                Message<byte[]> msg = consumer.receive();
                System.out.println("Received: " + new String(msg.getData()));
                consumer.acknowledge(msg);
            }
        }
    }
}
```
When to Use Which
Choose Kafka when you need a mature, widely adopted streaming platform with a large ecosystem and your use case fits well with partitioned log storage and simpler cluster management.
Choose Pulsar when you require native multi-tenancy, geo-replication, or want to scale storage and serving independently for cloud-native or large-scale applications.
Pulsar is also a good choice if you want built-in tiered storage, or subscription types (exclusive, shared, failover, key_shared) that cover both streaming and work-queue patterns beyond Kafka's consumer-group model.