ExternalTaskSensor in Airflow: What It Is and How It Works
ExternalTaskSensor in Airflow is a sensor that waits for a task in a different DAG to complete before continuing. It helps coordinate workflows by pausing one DAG until another DAG's task finishes successfully.
How It Works
The ExternalTaskSensor acts like a watchful friend who waits for someone else to finish their job before starting theirs. Imagine you have two workflows (DAGs) running independently, but one depends on the other to finish a specific task first. Instead of guessing or setting fixed delays, this sensor checks the status of that external task regularly.
It keeps checking until the external task is marked as successful. If the sensor's timeout passes first, the sensor fails instead of waiting forever. Once it sees the task done, it lets the current workflow continue. This way, you can link workflows smoothly without mixing their code or schedules.
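The check-and-wait loop described above can be sketched in plain Python. This is a minimal illustration, not Airflow's implementation: the helper wait_for_external_task and its get_state callback are hypothetical names, and the real sensor raises an exception on timeout rather than returning False.

```python
import time
from typing import Callable


def wait_for_external_task(
    get_state: Callable[[], str],
    poke_interval: float = 30.0,
    timeout: float = 600.0,
) -> bool:
    """Poll get_state() until it returns 'success' or the timeout passes.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    started = time.monotonic()
    while True:
        # One "poke": ask for the current state of the external task.
        if get_state() == "success":
            return True
        # Give up once the total waiting time reaches the timeout.
        if time.monotonic() - started >= timeout:
            return False
        # Otherwise sleep until the next check.
        time.sleep(poke_interval)


# Simulated external task: reports 'running' twice, then 'success'.
states = iter(["running", "running", "success"])
finished = wait_for_external_task(lambda: next(states), poke_interval=0.01, timeout=1.0)
print(finished)  # True
```

The short poke_interval and timeout here only keep the simulation fast; the example later in this article uses 30 and 600 seconds.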
Example
This example shows how to use ExternalTaskSensor to wait for a task named task_a in another DAG called dag_a before running task_b in the current DAG.
from airflow import DAG
from airflow.sensors.external_task import ExternalTaskSensor
from airflow.operators.dummy import DummyOperator
from datetime import datetime

with DAG(
    dag_id='dag_b',
    start_date=datetime(2024, 1, 1),
    schedule_interval='@daily',
    catchup=False
) as dag:
    wait_for_task_a = ExternalTaskSensor(
        task_id='wait_for_task_a',
        external_dag_id='dag_a',
        external_task_id='task_a',
        timeout=600,       # wait max 10 minutes
        poke_interval=30,  # check every 30 seconds
        mode='poke'
    )

    task_b = DummyOperator(task_id='task_b')

    wait_for_task_a >> task_b
When to Use
Use ExternalTaskSensor when you have multiple DAGs that depend on each other. For example, if one DAG processes raw data and another DAG analyzes that processed data, the analysis DAG should wait until the processing DAG finishes.
This sensor helps avoid race conditions and keeps data consistent by making sure tasks in different workflows run in the correct order.
It is also helpful when DAGs have different schedules or owners but still need to coordinate.
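When the two DAGs run at different times, the sensor has to be told which run of the other DAG to look at, because by default it looks for a run with the same logical date as its own. ExternalTaskSensor accepts an execution_delta argument for this, and the sensor subtracts that value from its own logical date. The sketch below reproduces only that date arithmetic in plain Python. The helper name and the 02:00 and 03:00 schedules are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def external_logical_date(own_logical_date: datetime, execution_delta: timedelta) -> datetime:
    """Return the logical date of the external run the sensor would look for.

    Hypothetical helper: it mirrors the documented meaning of execution_delta
    (own logical date minus the delta) and is not Airflow's actual code.
    """
    return own_logical_date - execution_delta


# Assumed schedules: dag_a runs daily at 02:00, dag_b runs daily at 03:00.
dag_b_run = datetime(2024, 1, 1, 3, 0)
target = external_logical_date(dag_b_run, timedelta(hours=1))
print(target)  # 2024-01-01 02:00:00
```

In the DAG itself, the same offset would be passed to the sensor as execution_delta=timedelta(hours=1). When the offset cannot be expressed as a fixed delta, the execution_date_fn argument takes a callable instead. Only one of the two may be set.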
Key Points
- ExternalTaskSensor waits for a task in another DAG to complete.
- It checks the external task status repeatedly until success or timeout.
- Helps coordinate workflows without merging DAGs.
- Supports parameters like timeout and poke_interval to control waiting behavior.
- Useful for dependent workflows with different schedules or owners.