Kafka · DevOps · ~10 mins

Why stream processing transforms data in Kafka - Visual Breakdown

Process Flow - Why stream processing transforms data
1. Data enters the stream: stream processing starts.
2. Apply transformation logic: the data changes form or content.
3. Output the transformed data: it is ready for the next step or for storage.
Data flows into a stream, gets transformed step-by-step, and then outputs in a new form for further use.
Execution Sample
Java (Kafka Streams)
inputStream.mapValues(value -> value.toUpperCase())
           .filter((key, value) -> value.startsWith("A"))
           .to(outputTopic);
This code uppercases each record's value, keeps only values starting with 'A', and writes them to an output topic. (In the Kafka Streams API, mapValues transforms the value alone, and filter receives both key and value.)
Process Table
Step | Input Value | Transformation Applied | Result | Output Action
1 | apple | toUpperCase | APPLE | Check filter
2 | APPLE | filter startsWith 'A' | true | Send to output
3 | banana | toUpperCase | BANANA | Check filter
4 | BANANA | filter startsWith 'A' | false | Discard
5 | avocado | toUpperCase | AVOCADO | Check filter
6 | AVOCADO | filter startsWith 'A' | true | Send to output
💡 All input values processed; stream ends or waits for new data.
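The pipeline in the table can be reproduced without a Kafka cluster. Here is a minimal plain-Java sketch (class and method names are illustrative, not part of any Kafka API) that applies the same map-then-filter logic with java.util.stream:

```java
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java simulation of the uppercase-then-filter pipeline from the
// execution sample; no broker is required to follow the data flow.
public class StreamSimulation {

    // Uppercase every value, then keep only values starting with "A".
    static List<String> process(List<String> input) {
        return input.stream()
                .map(String::toUpperCase)         // transformation step
                .filter(v -> v.startsWith("A"))   // filter step
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Same inputs as rows 1-6 of the process table.
        System.out.println(process(List.of("apple", "banana", "avocado")));
        // prints [APPLE, AVOCADO]
    }
}
```

Running it yields [APPLE, AVOCADO]: banana is uppercased but then discarded by the filter, matching row 4 of the table.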
Status Tracker
Variable | Start | After 1 | After 2 | After 3 | After 4 | After 5 | After 6 | Final
inputValue | apple | APPLE | banana | BANANA | avocado | AVOCADO | - | -
transformedValue | - | APPLE | - | BANANA | - | AVOCADO | - | -
passesFilter | - | true | - | false | - | true | - | -
outputSent | - | yes | - | no | - | yes | - | -
Key Moments - 3 Insights
Why do we transform data before filtering it?
Transforming first (e.g., to uppercase) normalizes the data so the filter behaves consistently, as rows 1-2 and 5-6 of the process table show.
What happens to data that does not pass the filter?
Data failing the filter (like 'BANANA' in row 4) is discarded and not sent to output, stopping further processing for that item.
Why is stream processing useful for real-time data?
It transforms and filters data instantly as it flows, allowing quick decisions and outputs without waiting for batch processing.
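The first insight can be checked directly: startsWith is case-sensitive in Java, so a raw lowercase value fails an 'A' check unless it is transformed first. A small sketch:

```java
// Demonstrates why transforming before filtering matters: startsWith is
// case-sensitive, so a raw lowercase value fails the "A" check.
public class FilterOrderDemo {
    public static void main(String[] args) {
        String raw = "apple";
        System.out.println(raw.startsWith("A"));               // false: case mismatch
        System.out.println(raw.toUpperCase().startsWith("A")); // true: normalized first
    }
}
```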
Visual Quiz - 3 Questions
Test your understanding
Looking at the execution table, what is the transformedValue at Step 3?
A. banana
B. BANANA
C. APPLE
D. avocado
💡 Hint
Check the 'Result' column at Step 3 in the execution table.
At which step is the input discarded because it does not pass the filter?
A. Step 2
B. Step 6
C. Step 4
D. Step 1
💡 Hint
Look for 'Discard' in the 'Output Action' column of the execution table.
If we remove the toUpperCase transformation, what would happen to the filter results?
A. The filter might miss values due to case mismatch
B. All values would be discarded
C. The filter would still work the same
D. All values would pass the filter
💡 Hint
Refer to the Key Moments section about why transformation before filtering matters.
Concept Snapshot
Stream processing transforms data as it flows.
Transformations change data form (e.g., uppercase).
Filters select data based on conditions.
Output sends transformed data forward.
This enables real-time, continuous data handling.
Full Transcript
Stream processing takes data as it comes in and changes it step-by-step. First, data enters the stream. Then, a transformation like changing text to uppercase happens. Next, a filter checks if the data meets a condition, such as starting with 'A'. If it passes, the data is sent to the output. If not, it is discarded. This process repeats for each piece of data, allowing quick and continuous updates. Transforming before filtering ensures the filter works correctly. Data that does not pass the filter stops moving forward. This method helps handle data in real time without waiting for batches.