FileSensor for file arrival detection in Apache Airflow - Practice Problems & Coding Challenges

Challenge - 5 Problems
Command Output (intermediate)
What does this FileSensor detect?
Given the following Airflow FileSensor configuration, what file event is it waiting for?
Apache Airflow
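from airflow.sensors.filesystem import FileSensor  # import path for Airflow 2.x

file_sensor = FileSensor(
    task_id='wait_for_file',
    filepath='/data/input/data_ready.txt',
    poke_interval=30,  # seconds between checks
    timeout=600  # seconds before the sensor gives up
)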
A. It waits until the file '/data/input/data_ready.txt' is modified before continuing.
B. It waits until the file '/data/input/data_ready.txt' is deleted before continuing.
C. It waits until the directory '/data/input/' is empty before continuing.
D. It waits until the file '/data/input/data_ready.txt' exists before continuing.
Hint: FileSensor checks only for the presence of a file at the given path. It does not watch for modification or deletion.
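The presence check described in the hint can be pictured with a small standard-library sketch. This is not Airflow's implementation; the function name `wait_for_file` is hypothetical, and the real sensor adds connection handling and other details that are not shown here.

```python
import os
import time


def wait_for_file(filepath, poke_interval=30, timeout=600):
    """Return True once filepath exists, or False after roughly timeout seconds."""
    started = time.monotonic()
    while True:
        # One "poke": only presence is checked, never modification or deletion.
        if os.path.exists(filepath):
            return True
        # Give up once the allowed waiting time has been used up.
        if time.monotonic() - started >= timeout:
            return False
        time.sleep(poke_interval)
```

Calling `wait_for_file('/data/input/data_ready.txt')` would block until the file appears or about ten minutes pass, which mirrors the configuration in the question.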
Configuration (intermediate)
Choose the correct FileSensor timeout setting
You want the FileSensor to stop waiting after 5 minutes if the file does not appear. Which timeout value should you set?
A. timeout=300
B. timeout=5
C. timeout=60
D. timeout=0
Hint: The timeout is given in seconds.
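Because the value is in seconds, one way to avoid unit mistakes is to derive it from a `timedelta`. This is only a convenience sketch; the variable name `timeout_seconds` is made up for illustration.

```python
from datetime import timedelta

# The sensor's timeout is expressed in seconds, so convert from minutes.
timeout_seconds = int(timedelta(minutes=5).total_seconds())
print(timeout_seconds)  # 300
```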
Workflow (advanced)
How does FileSensor behave with poke_interval?
If a FileSensor has poke_interval=10 and timeout=60, roughly how many times will it check for the file before timing out?
A. 6 times
B. 10 times
C. 60 times
D. 1 time
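Hint: Divide timeout by poke_interval to estimate the number of checks. The actual count can be off by one or so, because the first check runs immediately and each check takes some time.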
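The division from the hint, written out as a quick estimate. Treat the result as approximate rather than a guarantee about how many checks a real sensor performs; the variable `approx_checks` is a name chosen for this sketch.

```python
poke_interval = 10  # seconds between checks
timeout = 60        # seconds before the sensor gives up

# Rough estimate of how many checks fit into the timeout window.
approx_checks = timeout // poke_interval
print(approx_checks)  # 6
```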
Troubleshoot (advanced)
Why does FileSensor keep timing out without detecting the file?
You configured a FileSensor to watch '/tmp/data.csv' but it always times out. The file is created by another process. What could be a common cause?
A. The poke_interval is set too high.
B. The FileSensor does not support detecting files.
C. The file is created in a different directory than '/tmp'.
D. The timeout is set to zero.
Hint: Check the exact file path the other process writes to.
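A path mismatch like this can often be confirmed from a Python shell on the machine where the sensor runs. The helper below is a hypothetical debugging aid built only on the standard library, not part of Airflow.

```python
import os


def describe_path(filepath):
    """Collect facts that help explain why a sensor never sees a file."""
    absolute = os.path.abspath(filepath)
    parent = os.path.dirname(absolute)
    parent_exists = os.path.isdir(parent)
    return {
        "absolute_path": absolute,
        "exists": os.path.exists(absolute),
        "parent_exists": parent_exists,
        # Listing the parent can reveal near-miss names or a file written elsewhere.
        "siblings": sorted(os.listdir(parent)) if parent_exists else [],
    }


print(describe_path("/tmp/data.csv"))
```

If `exists` is false while the producing process reports success, comparing `siblings` with the path that process actually writes to is a reasonable next step.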
Best Practice (expert)
What is the best practice to avoid blocking Airflow workers with FileSensor?
In the default 'poke' mode, a FileSensor occupies a worker slot for the whole time it waits for a file. Which approach best avoids this problem?
A. Set poke_interval to 1 second to check very frequently.
B. Use the 'mode' parameter set to 'reschedule' to free the worker during waiting.
C. Increase timeout to a very high value to avoid task failure.
D. Run FileSensor tasks on the scheduler instead of workers.
Hint: The 'reschedule' mode releases the worker slot while waiting.
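A sketch of how the 'reschedule' setting might be applied to the sensor from the first problem. The `sensor_kwargs` dictionary and `build_sensor` helper are hypothetical names, the 60-second interval is an illustrative value, and the import path assumes Airflow 2.x; adjust it for the version you run.

```python
# Keyword arguments for a FileSensor that waits in 'reschedule' mode.
sensor_kwargs = {
    "task_id": "wait_for_file",
    "filepath": "/data/input/data_ready.txt",
    "poke_interval": 60,   # seconds between checks (illustrative value)
    "timeout": 600,        # seconds before the sensor gives up
    "mode": "reschedule",  # give the worker slot back between checks
}


def build_sensor():
    """Create the sensor; requires Airflow (import path assumes Airflow 2.x)."""
    from airflow.sensors.filesystem import FileSensor

    return FileSensor(**sensor_kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to inspect or reuse the configuration without importing Airflow; `build_sensor()` would normally be called inside a DAG definition.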