0
0
KafkaConceptBeginner · 3 min read

What Is State Store in Kafka Streams: Simple Explanation and Example

A state store in Kafka Streams is a local database that keeps track of data during stream processing. It helps maintain and query state information like counts or aggregates while processing continuous data streams.
⚙️

How It Works

Think of a state store as a notebook that your Kafka Streams application carries around. While processing data streams, it writes important information in this notebook to remember past events. This helps the app make decisions based on history, like counting how many times a word appeared.

The state store lives locally on the machine running the stream app, so it can quickly read and update data without asking a remote database. Kafka Streams also backs up this notebook by saving changes to a Kafka topic, so if the app stops and restarts, it can restore the notebook and continue where it left off.

💻

Example

This example shows how to create a simple key-value state store in Kafka Streams to count words.
java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.state.Stores;
import java.util.Properties;

public class WordCountWithStateStore {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Define a state store named "CountsStore"
        builder.addStateStore(Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("CountsStore"),
                Serdes.String(),
                Serdes.Long()));

        KStream<String, String> textLines = builder.stream("input-topic");

        textLines.flatMapValues(value -> java.util.Arrays.asList(value.toLowerCase().split(" ")))
                .groupBy((key, word) -> word)
                .count()
                .toStream()
                .to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
Output
The application reads lines from 'input-topic', counts words using the state store 'CountsStore', and writes counts to 'output-topic'.
🎯

When to Use

Use a state store when your stream processing needs to remember information across many events. For example, counting clicks per user, tracking session data, or aggregating sales totals over time.

This is useful in real-time analytics, monitoring, and any case where you want to keep track of running totals or patterns without losing data if the app restarts.

Key Points

  • A state store is a local database inside Kafka Streams for storing data during processing.
  • It allows fast access to past data to compute aggregates or track state.
  • State stores are backed up by Kafka topics for fault tolerance.
  • Common types include key-value stores and window stores.

Key Takeaways

A state store keeps track of data locally during Kafka Streams processing.
It enables real-time aggregation and stateful computations.
State stores are fault-tolerant by backing up data to Kafka topics.
Use state stores for counting, session tracking, and other stateful tasks.