Challenge - 5 Problems

Kappa Architecture Mastery

🧠 Conceptual

intermediate

Core principle of Kappa architecture

Which statement best describes the core principle of the Kappa architecture in streaming data processing?

A. It processes all data through a single stream-processing pipeline, with no separate batch layer.

B. It requires separate batch and speed layers to handle data processing.

C. It stores all data in a data lake before processing it in batches.

D. It uses multiple batch jobs to update the data warehouse periodically.
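
For reference, the idea of a single pipeline reading from a replayable log can be sketched as follows. This is a minimal illustration, not a production design: `EVENT_LOG`, `run_pipeline`, `logic_v1`, and `logic_v2` are hypothetical names, and the in-memory list merely stands in for a durable, append-only log such as a Kafka topic.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for an append-only event log (e.g. a Kafka topic).
EVENT_LOG: List[int] = [5, 12, 7, 20]


def run_pipeline(
    log: Iterable[int],
    transform: Callable[[int], int],
    start_offset: int = 0,
) -> List[int]:
    """Apply `transform` to every event at or after `start_offset`."""
    return [
        transform(event)
        for offset, event in enumerate(log)
        if offset >= start_offset
    ]


def logic_v1(value: int) -> int:
    # Assumed first version of the processing logic.
    return value * 2


def logic_v2(value: int) -> int:
    # Assumed revised version of the processing logic.
    return value * 2 + 1


# Normal operation: the one pipeline consumes the stream.
live_results = run_pipeline(EVENT_LOG, logic_v1)

# After a logic change, the same log is replayed through the same
# pipeline code rather than through a separate batch job.
replayed_results = run_pipeline(EVENT_LOG, logic_v2, start_offset=0)

print(live_results)
print(replayed_results)
```

The point of the sketch is that reprocessing reuses the streaming code path and only changes the starting offset.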

❓ Predict Output

intermediate

Output of a streaming filter in Kappa architecture

Given the following Python snippet, which mimics a streaming filter that keeps events with a value greater than 50, what is the output for the input events [30, 60, 45, 80]?

Python

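# Sample events standing in for an incoming stream
input_events = [30, 60, 45, 80]
# Keep only the events whose value is greater than 50
filtered_events = list(filter(lambda x: x > 50, input_events))
print(filtered_events)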

A. [60, 80]

B. [30, 45]

C. [30, 60, 45, 80]

D. []
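
In a real stream the events are not all available up front, so a filter usually handles them one at a time. The sketch below shows that style with a generator; `filter_stream` and the sample values are hypothetical and deliberately differ from the question's input.

```python
from typing import Iterable, Iterator


def filter_stream(events: Iterable[int], threshold: int) -> Iterator[int]:
    """Lazily yield each event whose value is strictly above `threshold`."""
    for event in events:
        if event > threshold:
            yield event


# An iterator stands in for an unbounded source; values are illustrative.
sample_source = iter([12, 75, 50, 51])
passed = list(filter_stream(sample_source, threshold=50))
print(passed)
```

Because the comparison is strict, an event exactly equal to the threshold is dropped.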

❓ Data Output

advanced

Result of windowed aggregation in streaming

In a Kappa streaming pipeline, a 5-minute tumbling window sums event values that arrive once per minute: [2, 3, 5, 7, 1]. What sum does the window emit?

C. 10

D. 18
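
As background on how a tumbling window groups events, here is a minimal sketch. It assumes each event is a `(timestamp_seconds, value)` pair and that windows are aligned to multiples of the window size; `tumbling_window_sums` and the sample data are hypothetical and unrelated to the question's values.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_sums(
    events: Iterable[Tuple[int, int]], window_seconds: int
) -> Dict[int, int]:
    """Return {window_start_seconds: sum of values falling in that window}."""
    sums: Dict[int, int] = defaultdict(int)
    for timestamp, value in events:
        # Tumbling windows do not overlap, so each event maps to exactly one.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
    return dict(sums)


# Illustrative events as (timestamp in seconds, value) pairs.
sample_events = [(0, 4), (60, 1), (240, 6), (300, 10), (360, 2)]
window_sums = tumbling_window_sums(sample_events, window_seconds=300)
print(window_sums)
```

An event stamped exactly at 300 seconds belongs to the second window, since each window covers its start time up to but not including its end.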

🔧 Debug

advanced

Identify error in streaming data processing code

What error, if any, does this Python code raise?

events = [10, 20, 30]
result = sum(events.filter(lambda x: x > 15))
print(result)

A. SyntaxError: invalid syntax

B. TypeError: 'int' object is not callable

C. AttributeError: 'list' object has no attribute 'filter'

D. No error, output is 50

🚀 Application

expert

Choosing Kappa architecture for a use case

You have a system that requires real-time fraud detection on transaction data with minimal latency and no need for batch reprocessing. Which architecture is best suited?

A. Data lake architecture, because it stores raw data for later analysis.

B. Kappa architecture, because it processes all data in a single low-latency streaming pipeline.

C. Batch processing architecture, because it handles large volumes efficiently.

D. Lambda architecture, because it combines batch and speed layers for accuracy.
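
Whichever option is chosen, low-latency detection implies scoring each transaction as it arrives rather than waiting for a scheduled job. The sketch below illustrates that per-event style under loud assumptions: the `Transaction` type, the `flag_suspicious` function, and the "more than 3x the account's running mean" rule are all hypothetical and are not a real fraud model.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass
class Transaction:
    account: str
    amount: float


def flag_suspicious(
    stream: Iterable[Transaction], factor: float = 3.0
) -> Iterator[Transaction]:
    """Yield transactions exceeding `factor` times the account's running mean."""
    totals: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for tx in stream:
        seen = counts.get(tx.account, 0)
        # The first transaction of an account has no history to compare with.
        if seen > 0 and tx.amount > factor * (totals[tx.account] / seen):
            yield tx
        # Update per-account state after the check, one event at a time.
        totals[tx.account] = totals.get(tx.account, 0.0) + tx.amount
        counts[tx.account] = seen + 1


transactions = [
    Transaction("a", 10.0),
    Transaction("a", 12.0),
    Transaction("b", 500.0),
    Transaction("a", 100.0),
]
flagged = list(flag_suspicious(transactions))
print(flagged)
```

State is kept per account and updated incrementally, so each decision depends only on events already seen.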
