What Is a State Store in Kafka Streams: Simple Explanation and Example
A state store in Kafka Streams is a local, embedded store (RocksDB-backed by default for persistent stores) that keeps track of data during stream processing. It lets the application maintain and query state such as counts or aggregates while it processes continuous data streams.
How It Works
Think of a state store as a notebook that your Kafka Streams application carries around. While processing data streams, it writes important information in this notebook to remember past events. This helps the app make decisions based on history, like counting how many times a word appeared.
The state store lives locally on the machine running the stream app, so it can read and update data quickly without calling a remote database. Kafka Streams also backs up this notebook by writing every change to a Kafka topic (the changelog topic), so if the app stops and restarts, it can replay that topic to restore the notebook and continue where it left off.
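The local-notebook-plus-backup idea above can be sketched in plain Java without any Kafka dependency. This is only a simplified model, not the Kafka Streams API: the class `ChangelogStoreSketch`, its `Change` type, and the in-memory list standing in for the backup topic are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a state store whose updates are also appended to a log.
// Hypothetical names throughout; real Kafka Streams stores work differently inside.
public class ChangelogStoreSketch {

    /** One recorded update: a key and the value it was set to. */
    public static final class Change {
        final String key;
        final long value;

        Change(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    // The "notebook": fast local reads and writes.
    private final Map<String, Long> local = new HashMap<>();
    // Stand-in for the Kafka topic that backs up every change.
    private final List<Change> changelog;

    public ChangelogStoreSketch(List<Change> changelog) {
        this.changelog = changelog;
    }

    /** Adds one to the count for a key, updating the local map and the log. */
    public long increment(String key) {
        long updated = local.getOrDefault(key, 0L) + 1;
        local.put(key, updated);
        changelog.add(new Change(key, updated));
        return updated;
    }

    /** Returns the current count, or null if the key was never seen. */
    public Long get(String key) {
        return local.get(key);
    }

    /** Rebuilds the local map by replaying the log from the beginning. */
    public static ChangelogStoreSketch restore(List<Change> changelog) {
        ChangelogStoreSketch store = new ChangelogStoreSketch(changelog);
        for (Change change : changelog) {
            store.local.put(change.key, change.value);
        }
        return store;
    }

    public static void main(String[] args) {
        List<Change> topic = new ArrayList<>();
        ChangelogStoreSketch store = new ChangelogStoreSketch(topic);
        for (String word : "to be or not to be".split(" ")) {
            store.increment(word);
        }

        // Simulate a restart: the local map is gone, only the log survives.
        ChangelogStoreSketch restored = restore(topic);
        System.out.println(restored.get("to")); // prints 2
    }
}
```

Replaying the log in order means the last write for each key wins, which is why the restored store ends up with the same counts as the one that was lost.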
Example
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Arrays;
import java.util.Properties;

public class WordCountWithStateStore {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> textLines = builder.stream("input-topic");

        textLines
            // Split each line into lowercase words
            .flatMapValues(value -> Arrays.asList(value.toLowerCase().split(" ")))
            // Re-key every record by the word itself
            .groupBy((key, word) -> word)
            // Keep the running count per word in a state store named "CountsStore".
            // Registering a store with builder.addStateStore(...) alone would leave
            // it unused, because count() would create its own internal store.
            .count(Materialized.as("CountsStore"))
            .toStream()
            // Counts are Long values, so state the serdes explicitly instead of
            // relying on the String default configured above
            .to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Close the application cleanly on shutdown
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
When to Use
Use a state store when your stream processing needs to remember information across many events. Typical examples are counting clicks per user, tracking session data, and aggregating sales totals over time.
This is useful in real-time analytics, monitoring, and any case where you want to keep track of running totals or patterns without losing data if the app restarts.
Key Points
- A state store is a local database inside Kafka Streams for storing data during processing.
- It allows fast access to past data to compute aggregates or track state.
- State stores are backed up by Kafka topics for fault tolerance.
- Common types include key-value stores and window stores.
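To make the difference between a key-value store and a window store more concrete, here is a small plain-Java model of windowed counting. It is a sketch under stated assumptions, not the Kafka Streams window store API: the class `WindowStoreSketch`, its method names, and the choice of fixed-size, non-overlapping windows aligned to timestamp zero are hypothetical, and timestamps are assumed to be non-negative milliseconds.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Simplified model of a window store: counts are kept per key AND per time window,
// whereas a key-value store keeps one value per key. Hypothetical names throughout.
public class WindowStoreSketch {
    private final long windowSizeMs;
    // key -> (window start timestamp -> count within that window)
    private final Map<String, Map<Long, Long>> counts = new HashMap<>();

    public WindowStoreSketch(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    /** Start of the fixed-size window that contains the given timestamp. */
    public long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    /** Adds one to the count for a key in the window containing the timestamp. */
    public long increment(String key, long timestampMs) {
        Map<Long, Long> perWindow = counts.computeIfAbsent(key, k -> new HashMap<>());
        return perWindow.merge(windowStart(timestampMs), 1L, Long::sum);
    }

    /** Count for a key in the window containing the timestamp, or 0 if none. */
    public long fetch(String key, long timestampMs) {
        return counts
            .getOrDefault(key, Collections.<Long, Long>emptyMap())
            .getOrDefault(windowStart(timestampMs), 0L);
    }

    public static void main(String[] args) {
        WindowStoreSketch clicks = new WindowStoreSketch(60_000L); // one-minute windows
        clicks.increment("user-1", 5_000L);
        clicks.increment("user-1", 59_999L);
        clicks.increment("user-1", 60_000L); // falls into the next window

        System.out.println(clicks.fetch("user-1", 0L));      // prints 2
        System.out.println(clicks.fetch("user-1", 61_000L)); // prints 1
    }
}
```

The same user ends up with separate counts for each minute, which is the kind of per-period total (clicks per user per minute, sales per hour) that a window store is suited to.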