
KStream vs KTable in Kafka: Key Differences and Usage Guide

In Kafka, KStream represents a stream of continuous events where each record is independent, while KTable represents a changelog stream that models a table with updates and deletions. KStream processes every event as it arrives, whereas KTable maintains the latest state per key, making it suitable for stateful computations.

Quick Comparison

Here is a quick side-by-side comparison of KStream and KTable based on key factors.

| Factor | KStream | KTable |
| --- | --- | --- |
| Data Model | Stream of independent events | Table of latest state per key |
| Event Processing | Processes every event | Processes updates and maintains state |
| State | Stateless by default | Stateful with changelog support |
| Use Case | Event-driven processing, transformations | Stateful aggregations, joins, lookups |
| Update Semantics | No inherent update, events are immutable | Updates and deletes reflected as changes |
| Storage | No internal storage | Stores latest values in state store |

Key Differences

KStream is a stream abstraction that treats each record as an independent event. It is ideal for processing continuous flows of data where each event matters on its own, such as logs or user clicks. It keeps no record of previous events unless you apply a stateful operation such as an aggregation, a windowed computation, or a join.

KTable, on the other hand, models a table abstraction where each key has a single latest value. It consumes a changelog stream that represents updates and deletions, maintaining the current state internally. This makes KTable suitable for scenarios like counting, aggregations, or lookups where the latest state per key is important.
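To make the two interpretations concrete, here is a minimal plain-Java sketch (no Kafka dependency) that reads the same four records first as a stream and then as a table. The class `StreamVsTableDemo`, the `Rec` type, and the sample keys and values are hypothetical names chosen for illustration. A null value is treated as a delete, mirroring Kafka tombstone records.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StreamVsTableDemo {
    /** One keyed record; a null value stands for a delete (tombstone). */
    public static final class Rec {
        final String key;
        final String value;

        public Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Stream view: every record is kept as its own event. */
    public static List<String> asStream(List<Rec> records) {
        List<String> events = new ArrayList<>();
        for (Rec r : records) {
            events.add(r.key + "=" + r.value);
        }
        return events;
    }

    /** Table view: each record upserts its key; a null value removes the key. */
    public static Map<String, String> asTable(List<Rec> records) {
        Map<String, String> state = new LinkedHashMap<>();
        for (Rec r : records) {
            if (r.value == null) {
                state.remove(r.key);
            } else {
                state.put(r.key, r.value);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        List<Rec> records = List.of(
            new Rec("alice", "Berlin"),
            new Rec("bob", "Lima"),
            new Rec("alice", "Rome"),
            new Rec("bob", null));

        // Stream view keeps all four events
        System.out.println(asStream(records)); // [alice=Berlin, bob=Lima, alice=Rome, bob=null]
        // Table view keeps only the latest value per key; bob was deleted
        System.out.println(asTable(records));  // {alice=Rome}
    }
}
```

The stream view grows with every record, while the table view is bounded by the number of distinct keys, which is why a KTable needs a state store and a KStream on its own does not.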

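While KStream processes every event as a new record, KTable processes events as updates to existing keys. This difference affects how joins and aggregations behave: a join against a KTable looks up the latest value for the key, whereas a KStream-KStream join matches events that fall within a time window. Both kinds of join are stateful: the table side is held in a state store, and a stream-stream join buffers records from both sides for the duration of the window.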
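The lookup behavior of a stream-table inner join can be sketched in plain Java without any Kafka dependency. In this simplified model, table updates only change state, and only stream events produce output, joined against whatever the table holds at that moment. The class `StreamTableJoinDemo`, the `Input` type, and the output format are assumptions made for illustration, not Kafka Streams APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StreamTableJoinDemo {
    /** A keyed input tagged with its source: a table update or a stream event. */
    public static final class Input {
        final boolean tableUpdate;
        final String key;
        final String value;

        public Input(boolean tableUpdate, String key, String value) {
            this.tableUpdate = tableUpdate;
            this.key = key;
            this.value = value;
        }
    }

    /** Processes inputs in arrival order and returns the joined stream output. */
    public static List<String> join(List<Input> inputs) {
        Map<String, String> table = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (Input in : inputs) {
            if (in.tableUpdate) {
                // Table side: upsert or delete, but emit nothing
                if (in.value == null) {
                    table.remove(in.key);
                } else {
                    table.put(in.key, in.value);
                }
            } else {
                // Stream side: look up the current table value for this key
                String current = table.get(in.key);
                if (current != null) { // inner join: no match means no output
                    out.add(in.key + ":" + in.value + "@" + current);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Input> inputs = List.of(
            new Input(true, "alice", "Berlin"),   // table: alice lives in Berlin
            new Input(false, "alice", "click1"),  // stream event sees Berlin
            new Input(true, "alice", "Rome"),     // table: alice moves to Rome
            new Input(false, "alice", "click2"),  // stream event sees Rome
            new Input(false, "bob", "click3"));   // no table entry, dropped
        System.out.println(join(inputs)); // [alice:click1@Berlin, alice:click2@Rome]
    }
}
```

Note that the earlier output for `click1` is not revised when the table changes: each stream event is joined once, against the table state at the time it is processed.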


Code Comparison

java
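import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import java.util.Properties;

public class KStreamExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Create a KStream from the input topic, reading keys and values as strings
        KStream<String, String> stream =
            builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()));

        // Convert each value to uppercase; pass null values through unchanged
        KStream<String, String> upperStream =
            stream.mapValues(value -> value == null ? null : value.toUpperCase());

        // Write to the output topic with explicit string serdes
        upperStream.to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put("application.id", "kstream-app");
        props.put("bootstrap.servers", "localhost:9092");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        // Close the Streams client cleanly when the JVM shuts down
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}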
Output
Records from 'input-topic' are read, values converted to uppercase, and written to 'output-topic' as a continuous stream.

KTable Equivalent

java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;
import java.util.Properties;

public class KTableExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Create a KTable from the input topic, reading keys and values as strings
        KTable<String, String> table =
            builder.table("input-topic", Consumed.with(Serdes.String(), Serdes.String()));

        // Convert the value stored for each key to uppercase
        KTable<String, String> upperTable = table.mapValues(value -> value.toUpperCase());

        // Convert the table's changes back to a stream and write them to the output topic
        upperTable.toStream().to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put("application.id", "ktable-app");
        props.put("bootstrap.servers", "localhost:9092");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        // Close the Streams client cleanly when the JVM shuts down
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}
Output
Records from 'input-topic' are read as updates to a table, values are converted to uppercase, and changes to the table are written to 'output-topic'.

When to Use Which

Choose KStream when you need to process every event independently, such as event logging, filtering, or real-time transformations where the order and occurrence of each event matter.

Choose KTable when you need to maintain and query the latest state per key, such as counting occurrences, aggregating data, or performing stateful joins where updates and deletions must be tracked.

In summary, use KStream for event-driven processing and KTable for stateful, table-like views of your data.
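The two abstractions are often used together: aggregating a KStream yields a KTable. The following is a minimal sketch of counting events per key, assuming the `kafka-streams` library is on the classpath. The topic names `clicks` and `click-counts` and the class name `ClickCountTopology` are hypothetical. It only builds and prints the topology, so it does not need a running broker.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class ClickCountTopology {
    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // Every click is an independent event, so it is read as a KStream
        // ("clicks" is a hypothetical topic keyed by user id)
        KStream<String, String> clicks =
            builder.stream("clicks", Consumed.with(Serdes.String(), Serdes.String()));

        // Counting per key is stateful, so the result is a KTable
        KTable<String, Long> countsPerUser = clicks
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .count();

        // Emit the table's updates as a changelog stream
        // ("click-counts" is a hypothetical output topic)
        countsPerUser.toStream()
            .to("click-counts", Produced.with(Serdes.String(), Serdes.Long()));

        return builder.build();
    }

    public static void main(String[] args) {
        // Print the processor topology instead of starting the application
        System.out.println(build().describe());
    }
}
```

To run it against a cluster, the returned `Topology` would be passed to `KafkaStreams` together with the same kind of `Properties` used in the examples above.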

Key Takeaways

KStream processes each event as a separate record and keeps no state unless you apply a stateful operation.
KTable maintains the latest state per key and processes updates as changes.
Use KStream for event-driven, stateless processing.
Use KTable for stateful aggregations, lookups, and joins.
The choice depends on whether you need a stream of events or a table of current states.