
Kappa architecture (streaming only) in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Core principle of Kappa architecture

Which statement best describes the core principle of the Kappa architecture in streaming data processing?

A. It treats all data as a stream and processes it through a single stream processing pipeline, with no separate batch layer.
B. It requires separate batch and speed layers to handle data processing.
C. It stores all data in a data lake before processing it in batches.
D. It uses multiple batch jobs to update the data warehouse periodically.
💡 Hint

Think about how Kappa architecture simplifies data processing by avoiding multiple layers.
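
The single-pipeline idea behind this question can be sketched in plain Python. This is only an illustration under stated assumptions: `event_log` is a hypothetical in-memory stand-in for a durable, replayable stream, and `process` is a made-up processing step, not part of any Hadoop API. The point is that one code path serves both the live view and any later recomputation.

```python
from typing import Dict, Iterable, Iterator

def process(events: Iterable[Dict]) -> Iterator[Dict]:
    """The single processing path: tag each event as it streams by."""
    for event in events:
        # Hypothetical enrichment rule, chosen only for illustration.
        yield {**event, "is_large": event["amount"] > 100}

# Hypothetical append-only log standing in for a replayable stream.
event_log = [{"id": 1, "amount": 50}, {"id": 2, "amount": 250}]

# Live results come from the streaming path.
live_view = list(process(event_log))

# A recomputation replays the same log through the same code;
# there is no second, batch-only implementation to keep in sync.
rebuilt_view = list(process(event_log))

print(live_view == rebuilt_view)  # True
```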

Predict Output
intermediate
Output of a streaming filter in Kappa architecture

Given the following Python snippet, which models a streaming filter that keeps events with a value greater than 50, what is the output for the input events [30, 60, 45, 80]?

input_events = [30, 60, 45, 80]
filtered_events = list(filter(lambda x: x > 50, input_events))
print(filtered_events)
A. [60, 80]
B. [30, 45]
C. [30, 60, 45, 80]
D. []
💡 Hint

Filter keeps only values greater than 50.
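
A list-based filter sees all events at once, whereas a stream delivers them one at a time. As a minimal sketch of the same rule applied lazily, the generator below yields matching events as they arrive. The function name `filter_stream` and the sample values are assumptions made for this example.

```python
from typing import Iterable, Iterator

def filter_stream(events: Iterable[int], threshold: int) -> Iterator[int]:
    """Yield each event whose value is strictly greater than threshold."""
    for value in events:
        if value > threshold:
            yield value

# Hypothetical sample stream; note that 50 itself does not pass "> 50".
sample_events = iter([12, 75, 50, 51])
passed = list(filter_stream(sample_events, threshold=50))
print(passed)  # [75, 51]
```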

Data Output
advanced
Result of windowed aggregation in streaming

In a Kappa streaming pipeline, a 5-minute tumbling window sums the values of events that arrive once per minute: [2, 3, 5, 7, 1]. What sum does the window emit?

A. 7
B. 5
C. 10
D. 18
💡 Hint

Sum all values in the 5-minute window.
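
A tumbling window assigns every event to exactly one fixed-size, non-overlapping interval. The sketch below shows one simple way to compute per-window sums, assuming each event is a hypothetical `(timestamp_seconds, value)` pair and that windows are aligned to multiples of the window size. The data is invented for the example and differs from the question's.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

WINDOW_SECONDS = 300  # 5-minute tumbling window

def tumbling_window_sums(events: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Sum event values per window, keyed by the window's start time."""
    sums: Dict[int, int] = defaultdict(int)
    for timestamp, value in events:
        # Integer division maps each timestamp to exactly one window.
        window_start = (timestamp // WINDOW_SECONDS) * WINDOW_SECONDS
        sums[window_start] += value
    return dict(sums)

# One event per minute; the last one falls into the next window.
events = [(0, 4), (60, 1), (120, 6), (180, 2), (240, 3), (300, 10)]
window_sums = tumbling_window_sums(events)
print(window_sums)  # {0: 16, 300: 10}
```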

🔧 Debug
advanced
Identify error in streaming data processing code

What error does this Python code, written for a streaming job, raise when it runs?

events = [10, 20, 30]
result = sum(events.filter(lambda x: x > 15))
print(result)
A. SyntaxError: invalid syntax
B. TypeError: 'int' object is not callable
C. AttributeError: 'list' object has no attribute 'filter'
D. No error, output is 50
💡 Hint

Check if 'filter' is a method of list objects in Python.
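
For comparison, here are two forms that standard Python accepts for filtering values before summing them: the built-in `filter` function and a generator expression. The `readings` list is invented for this example.

```python
readings = [5, 25, 40]

# Built-in filter(): takes the predicate first, then the iterable.
total_builtin = sum(filter(lambda x: x > 20, readings))

# Generator expression: often considered the more idiomatic spelling.
total_genexpr = sum(x for x in readings if x > 20)

print(total_builtin, total_genexpr)  # 65 65
```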

🚀 Application
expert
Choosing Kappa architecture for a use case

You have a system that requires real-time fraud detection on transaction data with minimal latency and no need for batch reprocessing. Which architecture is best suited?

A. Data lake architecture, because it stores raw data for later analysis.
B. Kappa architecture, because it processes all data in a single streaming pipeline.
C. Batch processing architecture, because it handles large volumes efficiently.
D. Lambda architecture, because it combines batch and speed layers for accuracy.
💡 Hint

Consider the need for low latency and no batch reprocessing.
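
To make the low-latency requirement concrete, the sketch below checks each transaction the moment it arrives, keeping only a small amount of per-account state. The rule (flag an amount above three times the account's running average) and the names `flag_suspicious` and `factor` are assumptions chosen for illustration, not a real fraud model.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

def flag_suspicious(
    transactions: Iterable[Tuple[str, float]], factor: float = 3.0
) -> Iterator[Tuple[str, float]]:
    """Yield transactions that exceed factor times the account's running average."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for account, amount in transactions:
        # Decide immediately, using only history seen so far for this account.
        if counts[account] > 0 and amount > factor * (totals[account] / counts[account]):
            yield (account, amount)
        totals[account] += amount
        counts[account] += 1

# Hypothetical transaction stream of (account, amount) pairs.
transactions = [("acct-1", 20.0), ("acct-1", 30.0), ("acct-2", 500.0), ("acct-1", 400.0)]
flagged = list(flag_suspicious(transactions))
print(flagged)  # [('acct-1', 400.0)]
```

The first transaction on an account is never flagged here because there is no history to compare against; a real system would need a policy for that case.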