Which statement best describes the core principle of the Kappa architecture in streaming data processing?
Think about how Kappa architecture simplifies data processing by avoiding multiple layers.
Kappa architecture processes data only once through a single streaming pipeline, eliminating the need for separate batch layers.
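The single-pipeline idea can be sketched in a few lines of Python. This is an illustrative toy, not a real streaming framework: the function names and event shape are assumptions, and the point is only that live events and replayed historical events go through the same code path.

```python
# Minimal sketch of Kappa's single-path principle: every event, live or
# replayed from the log, flows through one and the same streaming function.
def process(event):
    # One transformation applied uniformly to all events (illustrative).
    return {"value": event["value"] * 2}

def run_pipeline(event_stream):
    # A single pipeline handles real-time and historical data alike;
    # "reprocessing" means replaying the log through this same code path,
    # not maintaining a separate batch layer.
    return [process(e) for e in event_stream]

events = [{"value": 1}, {"value": 2}]
print(run_pipeline(events))  # [{'value': 2}, {'value': 4}]
```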
Given the following Hadoop streaming code snippet that filters events with value > 50, what is the output for input events [30, 60, 45, 80]?
input_events = [30, 60, 45, 80]
filtered_events = list(filter(lambda x: x > 50, input_events))
print(filtered_events)
Filter keeps only values greater than 50.
The filter function keeps only values greater than 50, so 60 and 80 remain; the printed output is [60, 80].
In a Kappa streaming pipeline, a 5-minute tumbling window sums event values arriving every minute: [2, 3, 5, 7, 1]. What is the sum output for the window?
Sum all values in the 5-minute window.
Sum of 2 + 3 + 5 + 7 + 1 equals 18.
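The window arithmetic can be checked with a small sketch. This is a simplified, in-memory stand-in for a tumbling window, not a real streaming API: events are assumed to arrive one per minute, so a 5-minute tumbling window covers five consecutive values, and non-overlapping windows are just consecutive slices.

```python
# Toy tumbling-window aggregation: split the value sequence into
# non-overlapping windows of `window_size` events and sum each one.
def tumbling_window_sums(values, window_size=5):
    sums = []
    for start in range(0, len(values), window_size):
        window = values[start:start + window_size]
        sums.append(sum(window))
    return sums

# One event per minute over 5 minutes fills exactly one window.
print(tumbling_window_sums([2, 3, 5, 7, 1]))  # [18]
```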
What error will this Hadoop streaming Python code raise?
events = [10, 20, 30]
result = sum(events.filter(lambda x: x > 15))
print(result)
Check if 'filter' is a method of list objects in Python.
Lists in Python do not have a 'filter' method, so calling events.filter(...) raises an AttributeError; 'filter' is a built-in function that takes the predicate and the iterable as arguments.
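The error and its fix can be demonstrated directly. The snippet below assumes the same events list as the question and shows both the failing call and the corrected use of the built-in filter():

```python
events = [10, 20, 30]

# Calling .filter on a list fails: list objects define no such method.
try:
    result = sum(events.filter(lambda x: x > 15))
except AttributeError as exc:
    print(type(exc).__name__)  # AttributeError

# The built-in filter() takes the predicate first, then the iterable.
result = sum(filter(lambda x: x > 15, events))
print(result)  # 50
```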
You have a system that requires real-time fraud detection on transaction data with minimal latency and no need for batch reprocessing. Which architecture is best suited?
Consider the need for low latency and no batch reprocessing.
Kappa architecture suits real-time needs with a single streaming pipeline, avoiding batch layers.