
ExternalTaskSensor in Airflow: What It Is and How It Works

The ExternalTaskSensor in Airflow is a sensor that waits for a task in a different DAG to complete before continuing. It helps coordinate workflows by pausing one DAG until another DAG's task finishes successfully.

How It Works

The ExternalTaskSensor acts like a colleague who waits for someone else to finish a job before starting their own. Imagine you have two workflows (DAGs) that run independently, but one of them needs a specific task in the other to finish first. Instead of guessing or setting fixed delays, the sensor checks the status of that external task at a regular interval.

It keeps checking until the external task reaches an allowed state, which is success by default, or until its own timeout expires, in which case the sensor fails. By default it looks at the run of the other DAG that has the same logical date as the current run. Once it sees the task done, it lets the current workflow continue. This way, you can link workflows without mixing their code or schedules.
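
The check-wait-repeat loop described above can be sketched in plain Python. This is only an illustration of the idea, not Airflow's actual implementation: the names wait_until and SensorTimeout are hypothetical, and the clock and sleep functions are injectable so the example runs instantly with simulated time.

```python
import time
from typing import Callable


class SensorTimeout(Exception):
    """Raised when the condition is not met before the timeout (hypothetical name)."""


def wait_until(
    check: Callable[[], bool],
    poke_interval: float,
    timeout: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call check() every poke_interval seconds until it returns True.

    Returns the number of checks made, or raises SensorTimeout once
    timeout seconds have elapsed without success.
    """
    started = clock()
    pokes = 0
    while True:
        pokes += 1
        if check():
            return pokes
        if clock() - started >= timeout:
            raise SensorTimeout(f"condition not met after {timeout} seconds")
        sleep(poke_interval)


# Simulated usage: the external task reports success on the third check.
states = iter(["running", "running", "success"])
fake_now = [0.0]  # simulated clock, in seconds


def fake_sleep(seconds: float) -> None:
    # Advance the simulated clock instead of really sleeping.
    fake_now[0] += seconds


pokes = wait_until(
    check=lambda: next(states) == "success",
    poke_interval=30,
    timeout=600,
    clock=lambda: fake_now[0],
    sleep=fake_sleep,
)
print(pokes, fake_now[0])
```

In this simulation the loop makes three checks and 60 simulated seconds pass. In a real sensor, poke_interval and timeout play the same roles as the two numeric arguments here.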


Example

This example shows how to use ExternalTaskSensor to wait for a task named task_a in another DAG called dag_a before running task_b in the current DAG.

python
from airflow import DAG
from airflow.sensors.external_task import ExternalTaskSensor
# EmptyOperator replaces the deprecated DummyOperator (Airflow 2.3+)
from airflow.operators.empty import EmptyOperator
from datetime import datetime

with DAG(
    dag_id='dag_b',
    start_date=datetime(2024, 1, 1),
    schedule_interval='@daily',  # assumes dag_a runs on the same schedule
    catchup=False
) as dag:

    # Waits for task_a in the dag_a run with the same logical date
    wait_for_task_a = ExternalTaskSensor(
        task_id='wait_for_task_a',
        external_dag_id='dag_a',
        external_task_id='task_a',
        timeout=600,  # fail the sensor after 10 minutes of waiting
        poke_interval=30,  # check every 30 seconds
        mode='poke'  # hold a worker slot between checks
    )

    task_b = EmptyOperator(task_id='task_b')

    wait_for_task_a >> task_b
Output
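In each run, 'dag_b' waits at 'wait_for_task_a' until 'task_a' in the 'dag_a' run with the same logical date completes successfully, then 'task_b' runs. If 'task_a' does not succeed within the 10-minute timeout, the sensor fails and 'task_b' does not run.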

When to Use

Use ExternalTaskSensor when you have multiple DAGs that depend on each other. For example, if one DAG processes raw data and another DAG analyzes that processed data, the analysis DAG should wait until the processing DAG finishes.

This sensor is useful to avoid race conditions and ensure data consistency by making sure tasks in different workflows run in the correct order.

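It is also helpful when DAGs have different owners but still need to coordinate. If the DAGs run on different schedules, keep in mind that the sensor by default looks for a run of the external DAG with the same logical date as its own run, so you need to set execution_delta or execution_date_fn to point it at the right run.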
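
To make the logical-date matching concrete, here is a small sketch of the arithmetic behind execution_delta. The helper external_logical_date is a hypothetical name used only for illustration, and the two schedules (dag_a daily at 00:00, dag_b daily at 02:00) are assumptions, not part of the example above.

```python
from datetime import datetime, timedelta
from typing import Optional


def external_logical_date(
    logical_date: datetime,
    execution_delta: Optional[timedelta] = None,
) -> datetime:
    """Return the logical date of the external run the sensor would look for.

    With no delta, the sensor expects the same logical date as its own run.
    With a delta, it looks that far back in time from its own logical date.
    """
    if execution_delta is None:
        return logical_date
    return logical_date - execution_delta


# Assumed schedules: dag_a runs daily at 00:00, dag_b runs daily at 02:00.
current_run = datetime(2024, 1, 2, 2, 0)  # logical date of the dag_b run

same_date = external_logical_date(current_run)
shifted = external_logical_date(current_run, timedelta(hours=2))

print(same_date)  # no dag_a run has this logical date, so the sensor would time out
print(shifted)    # matches the dag_a run at midnight
```

Under these assumptions, passing execution_delta=timedelta(hours=2) to ExternalTaskSensor would line the two DAGs up. When the offset is not constant, execution_date_fn takes a function that computes the target date instead; the two parameters cannot be used together.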

Key Points

  • ExternalTaskSensor waits for a task in another DAG to complete.
  • It checks the external task status repeatedly until success or timeout.
  • Helps coordinate workflows without merging DAGs.
  • Supports parameters like timeout and poke_interval to control waiting behavior.
  • Useful for dependent workflows with different owners, or with different schedules when execution_delta or execution_date_fn is set.

Key Takeaways

ExternalTaskSensor waits for a task in another DAG to finish before continuing.
It helps coordinate multiple workflows by linking their task dependencies across DAGs.
You can control how often it checks and how long it waits using parameters.
Use it to avoid running tasks too early and ensure data or process readiness.
It works well when DAGs are managed separately; when their schedules differ, set execution_delta or execution_date_fn so the sensor finds the right run.