Apache Airflow · devops · ~20 mins

Why orchestration is needed for data pipelines in Apache Airflow - Challenge Your Understanding

Challenge - 5 Problems
🧠 Conceptual
intermediate
Why use orchestration in data pipelines?

Imagine you have several tasks to run in a specific order to process data. Why is orchestration important in this scenario?

A. It automatically manages the order and dependencies of tasks to ensure correct execution.
B. It stores large amounts of data for faster access during processing.
C. It replaces the need for writing any code in data processing tasks.
D. It only monitors the hardware usage of the servers running the tasks.
💡 Hint

Think about how tasks depend on each other and need to run in a certain sequence.
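The idea behind the correct answer can be made concrete with a small sketch. This is plain Python, not Airflow: a toy orchestrator that runs each task only once all of its upstream dependencies have completed, which is exactly the ordering guarantee an orchestrator provides. All names here (`run_pipeline`, the `extract`/`transform`/`load` tasks) are illustrative, not part of any real API.

```python
# A minimal sketch (plain Python, not Airflow) of what an orchestrator does:
# run tasks only after their dependencies have completed, in a valid order.

def run_pipeline(tasks, deps):
    """tasks: {name: callable}; deps: {name: set of upstream names}."""
    done, order = set(), []
    while len(done) < len(tasks):
        # a task is ready when every upstream it depends on is already done
        ready = [t for t in tasks if t not in done and deps.get(t, set()) <= done]
        if not ready:
            raise RuntimeError("cycle or unsatisfiable dependency")
        for t in sorted(ready):
            tasks[t]()          # execute the task body
            done.add(t)
            order.append(t)
    return order

# extract -> transform -> load must run in this exact sequence
pipeline = {name: (lambda n=name: print(f"{n} done"))
            for name in ("extract", "transform", "load")}
dependencies = {"transform": {"extract"}, "load": {"transform"}}
print(run_pipeline(pipeline, dependencies))  # ['extract', 'transform', 'load']
```

Without the dependency map, nothing stops `load` from running before `extract` has produced any data; the orchestrator makes that ordering explicit and enforced.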

💻 Command Output
intermediate
Airflow DAG execution order output

Given a DAG with tasks A → B → C, what is the order of task execution shown in the Airflow logs?

Apache Airflow
Task A started
Task A completed
Task B started
Task B completed
Task C started
Task C completed
A. C, B, A
B. A, B, C
C. B, A, C
D. A, C, B
💡 Hint

Look at the order tasks start and complete in the logs.

🔀 Workflow
advanced
Identify the correct Airflow DAG dependency setup

Which Airflow DAG code snippet correctly sets task B to run after task A?

A. task_a >> task_b
B. task_b >> task_a
C. task_a + task_b
D. task_a & task_b
💡 Hint

In Airflow, the '>>' operator sets downstream dependencies.
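To see why `task_a >> task_b` reads the way it does, here is a toy `Task` class (not Airflow's actual operator class) that overloads Python's `__rshift__` the same way Airflow does, so the right-hand side becomes downstream of the left-hand side. A sketch under that assumption only:

```python
# Toy illustration of the ">>" dependency syntax: overloading __rshift__ so
# that a >> b records "b runs after a". Not Airflow's API, just the idea.

class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []   # tasks that must run after this one

    def __rshift__(self, other):
        self.downstream.append(other)  # other runs after self
        return other                   # returning other allows a >> b >> c

task_a, task_b = Task("task_a"), Task("task_b")
task_a >> task_b   # task_b is now downstream of task_a

print([t.task_id for t in task_a.downstream])  # ['task_b']
```

Returning the right-hand task from `__rshift__` is what makes chained declarations like `task_a >> task_b >> task_c` work: each `>>` links one pair and hands the chain forward.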

Troubleshoot
advanced
Why does a task in Airflow not run despite being scheduled?

You scheduled a task in Airflow, but it never runs. What is the most likely reason?

A. The Airflow webserver is down but the scheduler is running.
B. The task's Python code has print statements.
C. The task has no retries configured.
D. The task's upstream dependencies have not completed successfully.
💡 Hint

Think about task dependencies and what controls task execution.
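The gating behaviour behind the correct answer can be sketched in plain Python. This assumes the default "run only when all upstreams succeeded" rule (Airflow's default trigger rule); the `can_run` helper and its state map are illustrative names, not Airflow's API:

```python
# Sketch of scheduler gating: a task is not queued until every upstream task
# has finished successfully. A task with an incomplete upstream looks "stuck"
# even though it is correctly scheduled.

def can_run(task, states, deps):
    """states: {task: 'success' | 'failed' | None}; deps: upstream map."""
    return all(states.get(up) == "success" for up in deps.get(task, ()))

deps = {"load": ["extract", "transform"]}
states = {"extract": "success", "transform": None}  # transform never completed

print(can_run("load", states, deps))  # False: load waits, appearing not to run
```

So a scheduled task that never starts is usually waiting on an upstream that failed or never ran, not broken itself.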

Best Practice
expert
Best practice for handling task failures in Airflow pipelines

What is the best practice to handle a task failure in an Airflow data pipeline to avoid blocking the entire workflow?

A. Manually restart the Airflow scheduler after failure.
B. Ignore failures and let the pipeline continue without retries.
C. Set retries with a delay and use alerting to notify on failure.
D. Remove all dependencies so tasks run independently.
💡 Hint

Consider how to automatically recover and inform the team about issues.
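The retry-with-delay-plus-alerting pattern can be sketched in plain Python. In real Airflow this maps onto the `retries` and `retry_delay` operator arguments plus a failure callback; the `run_with_retries` helper below and its parameters are illustrative, not Airflow's API:

```python
# Sketch of the recommended failure handling: retry a flaky task a few times
# with a delay between attempts, and fire an alert only if every attempt fails.
import time

def run_with_retries(task, retries=3, delay=0.0, on_failure=None):
    for attempt in range(1, retries + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == retries:
                if on_failure:
                    on_failure(exc)   # notify the team instead of failing silently
                raise
            time.sleep(delay)         # back off before the next attempt

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

# Succeeds on the third attempt; transient failures never block the pipeline.
print(run_with_retries(flaky, retries=3, on_failure=lambda e: print("alert:", e)))
```

Retries absorb transient failures automatically, and the alert fires only when intervention is genuinely needed, so the rest of the workflow is not blocked by a one-off glitch.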