0
0
KafkaConceptBeginner · 4 min read

What is KStream in Kafka: Definition and Usage

KStream in Kafka is a core abstraction in Kafka Streams API that represents a continuous stream of records, where each record is a key-value pair. It allows developers to process and transform real-time data streams in a scalable and fault-tolerant way.
⚙️

How It Works

Think of KStream as a never-ending conveyor belt carrying data records one by one. Each record has a key and a value, like a label and its content. As new data arrives, KStream lets you apply operations such as filtering, mapping, or joining, similar to sorting or modifying items on the conveyor belt.

Under the hood, Kafka Streams manages the flow of data from Kafka topics, processes it in real-time, and ensures that if something goes wrong, it can recover without losing data. This makes KStream a powerful tool for building applications that react instantly to new information.

💻

Example

This example shows how to create a KStream from a Kafka topic, convert all values to uppercase, and write the results to another topic.

java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class UppercaseStream {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");

        KStream<String, String> uppercased = input.mapValues(value -> value.toUpperCase());

        uppercased.to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
Output
No direct console output; data from 'input-topic' is transformed to uppercase and sent to 'output-topic'.
🎯

When to Use

Use KStream when you need to process data as it flows in real-time, such as monitoring sensor data, tracking user activity on websites, or transforming logs for analysis. It is ideal for applications that require continuous data processing with low latency.

For example, an online store can use KStream to update inventory counts instantly as orders come in, or a financial app can detect fraud by analyzing transaction streams live.

Key Points

  • KStream represents a stream of key-value records from Kafka topics.
  • It supports real-time processing with operations like map, filter, and join.
  • Kafka Streams API handles fault tolerance and scalability automatically.
  • It is used for building event-driven, streaming applications.

Key Takeaways

KStream is a continuous stream of records for real-time processing in Kafka Streams.
It allows easy transformation and analysis of live data with simple operations.
Kafka Streams manages fault tolerance and scalability behind the scenes.
Use KStream for applications needing instant reaction to data changes.