ExternalTaskSensor for cross-DAG dependencies in Apache Airflow - Time & Space Complexity
When using ExternalTaskSensor in Airflow, we want to know how waiting for another DAG's task affects execution time.
We ask: How does the number of status checks the sensor performs grow as the wait time increases?
Analyze the time complexity of this ExternalTaskSensor usage.
from airflow.sensors.external_task import ExternalTaskSensor

sensor = ExternalTaskSensor(
    task_id='wait_for_task',
    external_dag_id='other_dag',
    external_task_id='task_to_wait_for',
    poke_interval=60,  # seconds between status checks
    timeout=3600,      # give up after 1 hour
)
This sensor waits for a task in another DAG to complete by checking every 60 seconds, timing out after 1 hour.
The sensor repeatedly checks if the external task is done.
- Primary operation: A periodic database query for the external task's status.
- How many times: At most timeout / poke_interval checks (e.g., 3600 / 60 = 60); the sensor stops earlier if the external task finishes first.
The maximum number of checks grows linearly with the timeout and is inversely proportional to the poke interval.
| Timeout (seconds) | Approx. maximum checks (poke_interval = 60 s) |
|---|---|
| 600 (10 min) | 10 |
| 3600 (1 hour) | 60 |
| 7200 (2 hours) | 120 |
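The arithmetic behind the table can be reproduced with a small helper. This is a sketch, not Airflow code: `max_checks` is a hypothetical name, and it assumes one check per poke interval with the external task never finishing. The exact count in a real sensor may differ by one, depending on when the first poke and the timeout check happen.

```python
import math


def max_checks(timeout: float, poke_interval: float) -> int:
    """Approximate upper bound on status checks before the sensor times out.

    Hypothetical helper for illustration; assumes the external task
    never finishes, so the sensor polls for the full timeout.
    """
    if poke_interval <= 0:
        raise ValueError("poke_interval must be positive")
    return math.ceil(timeout / poke_interval)


# Reproduce the table rows for poke_interval = 60 seconds.
for timeout in (600, 3600, 7200):
    print(timeout, max_checks(timeout, poke_interval=60))
```

Doubling the timeout doubles the bound, which is the straight-line growth the table shows.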
Pattern observation: More waiting time means more checks, growing in a straight line.
Time Complexity: O(n), where n = timeout / poke_interval is the maximum number of checks.
This means the sensor's work grows directly with the number of checks it performs while waiting.
[X] Wrong: "The sensor checks only once, so time doesn't grow with timeout."
[OK] Correct: The sensor keeps checking until the external task finishes or the timeout is reached, so a longer wait means more checks.
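The repeated checking can be illustrated with a simplified polling loop. This is a simulation under stated assumptions, not Airflow's implementation: `simulate_sensor` and `task_done_at` are hypothetical names, time is modeled as a counter instead of real sleeping, and the database query is replaced by a comparison.

```python
from typing import Tuple


def simulate_sensor(task_done_at: float, poke_interval: float, timeout: float) -> Tuple[int, bool]:
    """Return (checks performed, succeeded) for a simulated polling loop.

    task_done_at is the assumed time, in seconds after the sensor starts,
    at which the external task finishes.
    """
    elapsed = 0.0
    checks = 0
    while elapsed < timeout:
        checks += 1  # one status check (a database query in the real sensor)
        if elapsed >= task_done_at:
            return checks, True  # external task is done; stop polling
        elapsed += poke_interval  # stand-in for sleeping until the next poke
    return checks, False  # timed out before the task finished


# Task finishes after 150 seconds: the sensor stops well before the timeout.
print(simulate_sensor(task_done_at=150, poke_interval=60, timeout=3600))
# Task never finishes: the sensor polls for the whole timeout.
print(simulate_sensor(task_done_at=float("inf"), poke_interval=60, timeout=3600))
```

The loop performs more than one check whenever the task is not already done at the first poke, and the count is capped by the timeout.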
Understanding how sensors wait and check helps you design efficient workflows and avoid unnecessary delays in real projects.
"What if we reduce the poke_interval to 10 seconds? How would the time complexity change?"