
NiFi for data flow automation in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding NiFi FlowFiles

In Apache NiFi, what is a FlowFile?

A. A container that holds data and attributes moving through NiFi
B. A data record stored in HDFS for batch processing
C. A script that automates data ingestion in NiFi
D. A configuration file for NiFi processors
💡 Hint

Think about what moves inside NiFi between processors.
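
To make the hint concrete, here is a minimal Python sketch of the idea of a unit that carries both payload and metadata between processing steps. It is not NiFi's API: `FlowFileSketch`, `add_attribute`, and the `source.system` attribute are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFileSketch:
    """Hypothetical stand-in: a payload (content) plus key/value metadata (attributes)."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def add_attribute(flowfile, key, value):
    """Return a new sketch with one extra attribute; the content is left untouched."""
    return FlowFileSketch(flowfile.content, {**flowfile.attributes, key: value})


# Illustrative values only.
original = FlowFileSketch(b"id,status\n1,ok\n", {"filename": "data_2024.csv"})
tagged = add_attribute(original, "source.system", "demo")

print(tagged.attributes)
```

The sketch keeps content and attributes separate on purpose: a step can change the metadata without touching the payload.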

Predict Output
intermediate
NiFi Expression Language Output

What is the output of this NiFi Expression Language snippet if the attribute 'filename' is 'data_2024.csv'?

${filename:substring(5,9)}
A. 024.
B. 2024
C. 202
D. data
💡 Hint

substring(start, end) extracts characters from the zero-based start index up to, but not including, the end index.
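
The indexing rule in the hint (zero-based start, exclusive end) is the same rule Python slices follow, so it can be tried out with a short sketch. The helper `el_substring` and the filename `report_2023.txt` are hypothetical and deliberately differ from the question's input; this is plain Python, not the NiFi Expression Language itself.

```python
def el_substring(value: str, start: int, end: int) -> str:
    """Mimic substring(start, end): zero-based start index, end index excluded."""
    return value[start:end]


# Hypothetical filename, not the one used in the question.
name = "report_2023.txt"

# Index 7 is the first digit of the year; index 11 is the '.' and is excluded.
year = el_substring(name, 7, 11)
print(year)  # prints 2023
```

Because the end index is excluded, the result always has `end - start` characters when both indices are inside the string.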

Data Output
advanced
NiFi Data Provenance Query Result

You run a NiFi Data Provenance query filtering events by a specific processor and time range. The query returns 3 events. What does this output represent?

A. Three processors active during the time range
B. Three configuration changes made to NiFi
C. Three errors logged by NiFi in that time range
D. Three FlowFile events recorded for that processor in the time range
💡 Hint

Data Provenance tracks FlowFile events like creation, modification, and transfer.
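
As a rough illustration of what such a query does, the sketch below filters a list of event records by component name and time window. It is a simplified model, not NiFi's provenance API: `ProvenanceEventSketch`, `query_events`, the UUIDs, and the timestamps are all made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ProvenanceEventSketch:
    """Hypothetical event record: what happened, where, to which FlowFile, and when."""
    event_type: str
    component_name: str
    flowfile_uuid: str
    event_time: datetime


def query_events(events, component_name, start, end):
    """Keep events emitted by one component inside the inclusive [start, end] window."""
    return [
        e for e in events
        if e.component_name == component_name and start <= e.event_time <= end
    ]


# Made-up sample data for illustration.
events = [
    ProvenanceEventSketch("RECEIVE", "ConsumeKafka", "ff-1", datetime(2024, 1, 1, 10, 0)),
    ProvenanceEventSketch("SEND", "PutHDFS", "ff-1", datetime(2024, 1, 1, 10, 2)),
    ProvenanceEventSketch("SEND", "PutHDFS", "ff-2", datetime(2024, 1, 1, 10, 3)),
    ProvenanceEventSketch("SEND", "PutHDFS", "ff-3", datetime(2024, 1, 1, 11, 30)),
]

matches = query_events(
    events, "PutHDFS", datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 11, 0)
)
for event in matches:
    print(event.event_type, event.flowfile_uuid, event.event_time.isoformat())
```

Each row in the result describes one event on one FlowFile, so the count of rows is a count of events that matched the filter.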

🔧 Debug
advanced
Troubleshooting NiFi Processor Failure

Given this NiFi processor configuration snippet, which option explains why the processor fails to start?

Properties:
Input Directory: /data/input
File Filter: *.csv
Scheduling Strategy: Timer-driven
Run Schedule: 0 sec
A. Input Directory path must be absolute, '/data/input' is relative
B. File Filter '*.csv' is invalid syntax and causes failure
C. Run Schedule set to 0 seconds causes invalid scheduling interval error
D. Scheduling Strategy 'Timer-driven' is deprecated and causes failure
💡 Hint

Check if the run schedule value is valid for timer-driven scheduling.

🚀 Application
expert
Designing a NiFi Flow for Real-Time Data Filtering

You need to design a NiFi flow that reads streaming data from Kafka, keeps only the records whose 'status' field equals 'error', and writes those records to HDFS. Which sequence of processors correctly implements this?

A. ConsumeKafka -> QueryRecord (filter status=='error') -> PutHDFS
B. GetFile -> UpdateAttribute (status=='error') -> PutHDFS
C. ConsumeKafka -> RouteOnAttribute (status=='error') -> PutHDFS
D. ListenHTTP -> RouteOnContent (status=='error') -> PutFile
💡 Hint

Consider processors that can consume Kafka and filter records by content.
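
To see what filtering by content means at the record level, the sketch below models one batch of newline-delimited JSON records and keeps only those whose `status` field matches. It is plain Python under assumed conditions, not a NiFi processor: `filter_records`, the JSON-lines layout, and the sample records are hypothetical.

```python
import json


def filter_records(content: str, field_name: str, wanted: str) -> str:
    """Keep only the JSON-lines records whose field equals the wanted value."""
    kept = []
    for line in content.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        if record.get(field_name) == wanted:
            kept.append(json.dumps(record))
    return "\n".join(kept)


# One batch holding several records (illustrative data only).
batch = "\n".join([
    json.dumps({"id": 1, "status": "ok"}),
    json.dumps({"id": 2, "status": "error"}),
    json.dumps({"id": 3, "status": "error"}),
])

errors_only = filter_records(batch, "status", "error")
print(errors_only)
```

The point of the sketch is that the `status` value lives inside each record in the payload, so the check has to read the content of the batch rather than metadata attached to the batch as a whole.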