KStream vs KTable in Kafka: Key Differences and Usage Guide
KStream represents a stream of continuous events where each record is independent, while KTable represents a changelog stream that models a table with updates and deletions. KStream processes every event as it arrives, whereas KTable maintains the latest state per key, making it suitable for stateful computations.
Quick Comparison
Here is a quick side-by-side comparison of KStream and KTable based on key factors.
| Factor | KStream | KTable |
|---|---|---|
| Data Model | Stream of independent events | Table of latest state per key |
| Event Processing | Processes every event | Processes updates and maintains state |
| State | Stateless unless a stateful operation (aggregation, join, windowing) is applied | Stateful with changelog support |
| Use Case | Event-driven processing, transformations | Stateful aggregations, joins, lookups |
| Update Semantics | No inherent update, events are immutable | Updates and deletes reflected as changes |
| Storage | No internal storage | Stores latest values in state store |
Key Differences
KStream is a stream abstraction that treats each record as an independent event. It is ideal for processing continuous flows of data where each event matters on its own, such as logs or user clicks. It does not keep track of previous events or state unless you apply a stateful operation such as an aggregation or a windowed join.
KTable, on the other hand, models a table abstraction where each key has a single latest value. It consumes a changelog stream that represents updates and deletions, maintaining the current state internally. This makes KTable suitable for scenarios like counting, aggregations, or lookups where the latest state per key is important.
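To make the two interpretations concrete, here is a minimal sketch in plain Java that reads the same sequence of keyed records once as a stream and once as a table. It is a simulation only and does not use the Kafka Streams API. The `Rec` class, the method names, and the sample data are hypothetical, and a `null` value is assumed to act as a delete marker (tombstone).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the two interpretations; no Kafka dependency.
class StreamTableDuality {

    // Hypothetical keyed record; a null value stands for a delete (tombstone).
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Stream view: every record is kept as an independent event.
    static List<Rec> asStream(List<Rec> records) {
        return new ArrayList<>(records);
    }

    // Table view: each record upserts its key; a null value removes the key.
    static Map<String, String> asTable(List<Rec> records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Rec r : records) {
            if (r.value == null) {
                table.remove(r.key);
            } else {
                table.put(r.key, r.value);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<Rec> records = Arrays.asList(
            new Rec("alice", "Berlin"),
            new Rec("bob", "Lima"),
            new Rec("alice", "Rome"),  // update for alice
            new Rec("bob", null));     // tombstone for bob

        System.out.println(asStream(records).size()); // prints 4
        System.out.println(asTable(records));         // prints {alice=Rome}
    }
}
```

The stream view keeps all four records, while the table view collapses them to a single entry: the second record for `alice` overwrites the first, and the tombstone removes `bob`.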
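While KStream processes every event as a new record, KTable processes events as updates to existing keys. This difference affects how joins and aggregations behave. A KTable-KTable join is continuously updated to reflect the latest value on both sides. A KStream-KTable join enriches each incoming event with the table's current value for its key. A KStream-KStream join matches events that fall within a time window, so it is also stateful: the events on each side are buffered in a state store for the duration of the window.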
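To illustrate how a join between a stream and a table behaves, the following sketch simulates it in plain Java. It is not the Kafka Streams implementation. It assumes inner-join behavior in which only stream-side events produce output and table-side updates only change the lookup state. The class name, method names, and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of an inner stream-table join; no Kafka dependency.
class StreamTableJoinSketch {
    private final Map<String, String> table = new HashMap<>();
    private final List<String> output = new ArrayList<>();

    // Table side: upsert the key, or delete it when the value is null.
    // A table update alone produces no join output in this sketch.
    void onTableUpdate(String key, String value) {
        if (value == null) {
            table.remove(key);
        } else {
            table.put(key, value);
        }
    }

    // Stream side: each event looks up the table's current value for its key.
    // Events without a matching table entry are dropped (inner join).
    void onStreamEvent(String key, String value) {
        String current = table.get(key);
        if (current != null) {
            output.add(key + ": " + value + " @ " + current);
        }
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        StreamTableJoinSketch join = new StreamTableJoinSketch();
        join.onStreamEvent("alice", "click-1"); // no table entry yet, dropped
        join.onTableUpdate("alice", "Berlin");
        join.onStreamEvent("alice", "click-2"); // joined with Berlin
        join.onTableUpdate("alice", "Rome");
        join.onStreamEvent("alice", "click-3"); // joined with Rome

        // prints [alice: click-2 @ Berlin, alice: click-3 @ Rome]
        System.out.println(join.output());
    }
}
```

Each event is joined against whatever the table holds at the moment the event arrives, so two events with the same key can be enriched with different values.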
Code Comparison
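    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Produced;

    import java.util.Properties;

    public class KStreamExample {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();

            // Create a KStream from the input topic, with explicit String serdes
            KStream<String, String> stream =
                builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()));

            // Convert values to uppercase; every record is processed independently
            KStream<String, String> upperStream = stream.mapValues(value -> value.toUpperCase());

            // Write to the output topic
            upperStream.to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kstream-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            // Close the application cleanly when the JVM shuts down
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
            streams.start();
        }
    }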
KTable Equivalent
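    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Produced;

    import java.util.Properties;

    public class KTableExample {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();

            // Create a KTable from the input topic, with explicit String serdes
            KTable<String, String> table =
                builder.table("input-topic", Consumed.with(Serdes.String(), Serdes.String()));

            // Convert values to uppercase; each record updates the value for its key
            KTable<String, String> upperTable = table.mapValues(value -> value.toUpperCase());

            // Convert the table's changelog to a stream and write it to the output topic
            upperTable.toStream().to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "ktable-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            // Close the application cleanly when the JVM shuts down
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
            streams.start();
        }
    }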
When to Use Which
Choose KStream when you need to process every event independently, such as event logging, filtering, or real-time transformations where the order and occurrence of each event matter.
Choose KTable when you need to maintain and query the latest state per key, such as counting occurrences, aggregating data, or performing stateful joins where updates and deletions must be tracked.
In summary, use KStream for event-driven processing and KTable for stateful, table-like views of your data.
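The counting case mentioned above shows how the two abstractions connect: aggregating a stream of events by key yields a table of running totals. The sketch below simulates this in plain Java rather than with the Kafka Streams API. The class name, method name, and page-view keys are hypothetical, and it assumes every update is emitted, whereas a real application may collapse intermediate updates through caching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of counting events per key; no Kafka dependency.
class CountAggregationSketch {

    // Consumes one event per key occurrence, updates the running count in
    // `state`, and returns the sequence of table updates that were produced.
    static List<String> countByKey(List<String> eventKeys, Map<String, Long> state) {
        List<String> changelog = new ArrayList<>();
        for (String key : eventKeys) {
            long updated = state.merge(key, 1L, Long::sum);
            changelog.add(key + "=" + updated);
        }
        return changelog;
    }

    public static void main(String[] args) {
        // Stream side: four independent page-view events
        List<String> events = Arrays.asList("home", "cart", "home", "home");

        // Table side: latest count per key (TreeMap keeps keys sorted for printing)
        Map<String, Long> counts = new TreeMap<>();

        System.out.println(countByKey(events, counts)); // prints [home=1, cart=1, home=2, home=3]
        System.out.println(counts);                     // prints {cart=1, home=3}
    }
}
```

The input is a stream because every page view counts, and the result is a table because only the latest count per key matters to a reader.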
Key Takeaways
- KStream processes each event as a separate record and maintains no state by default.
- KTable maintains the latest state per key and processes records as updates to that state.
- Use KStream for event-driven, stateless processing.
- Use KTable for stateful aggregations, lookups, and joins.