
Kappa architecture (streaming only) in Hadoop - Time & Space Complexity

Time Complexity: Kappa architecture (streaming only)
O(n)
Understanding Time Complexity

We want to understand how the time needed to process data grows when using Kappa architecture for streaming data.

How does the system handle more data as it keeps coming in?

Scenario Under Consideration

Analyze the time complexity of the following streaming processing code snippet.

// Pseudocode for streaming data processing in Kappa architecture
stream = readFromKafka(topic)
processedStream = stream.map(record => transform(record))
processedStream.foreachBatch(batch => {
  batch.writeToStorage()
})

This code reads data continuously from a stream, transforms each record, and writes batches to storage.
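The steps above can be sketched as a small, runnable simulation. This is a minimal Python sketch, not a real Kafka consumer: the input list stands in for the stream, `transform` is a hypothetical per-record transformation, and an in-memory list stands in for external storage.

```python
from collections import deque

def transform(record):
    # Hypothetical per-record transformation: uppercase the payload.
    return record.upper()

def process_stream(records, batch_size=2):
    """Consume records one at a time, transform each, and flush
    transformed records to 'storage' in batches, mirroring the
    map + foreachBatch steps in the pseudocode above."""
    storage = []            # stands in for batch.writeToStorage()
    batch = deque()
    for record in records:  # one pass: O(n) over incoming records
        batch.append(transform(record))
        if len(batch) == batch_size:
            storage.extend(batch)   # flush a full batch
            batch.clear()
    storage.extend(batch)   # flush any partial final batch
    return storage

print(process_stream(["a", "b", "c"]))  # → ['A', 'B', 'C']
```

Batching only changes how often writes happen; every record still passes through `transform` exactly once.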

Identify Repeating Operations

Look at what repeats as data flows in.

  • Primary operation: Processing each record in the stream (map transformation).
  • How many times: Once per incoming record, continuously as data arrives.
How Execution Grows With Input

As more records come in, the system processes each one individually.

Input Size (n)    Approx. Operations
10                10 transformations
100               100 transformations
1000              1000 transformations

Pattern observation: The number of operations grows directly with the number of records.
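A quick way to check this pattern is to instrument the stream with a counter. This is a hedged sketch: `transform` here is an arbitrary stand-in, and the counts simply confirm that one transform call happens per record.

```python
def transform(record):
    return record * 2   # hypothetical stand-in transformation

def count_transforms(records):
    """Process a stream and count how many transform calls occur."""
    ops = 0
    for record in records:
        transform(record)
        ops += 1        # exactly one transform per record
    return ops

for n in (10, 100, 1000):
    print(n, count_transforms(range(n)))  # count grows 1:1 with n
```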

Final Time Complexity

Time Complexity: O(n)

This means the time to process data grows linearly with the number of incoming records.

Common Mistake

[X] Wrong: "Streaming processing time stays the same no matter how much data arrives."

[OK] Correct: Each new record needs processing, so more data means more work and more time.

Interview Connect

Understanding how streaming systems scale with data helps you explain real-time data processing clearly and confidently.

Self-Check

"What if the transform step became more complex and took longer per record? How would the time complexity change?"