How to Use set_downstream in Airflow for Task Dependencies
In Airflow, call set_downstream on a task to specify which tasks should run after it. For example, task1.set_downstream(task2) means task2 runs after task1. This lets you control the order of task execution in your DAG.
Syntax
The set_downstream method is called on a task object and takes either a single task or a list of tasks as its argument. The given tasks are set to run after the current task.
- task.set_downstream(other_task): Makes other_task run after task.
- You can pass a single task or a list of tasks.
python
# Single downstream task
task1.set_downstream(task2)
# Several downstream tasks at once
task1.set_downstream([task2, task3])
Example
This example shows how to create two tasks and use set_downstream to make the second task run after the first.
python
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

default_args = {
    'start_date': datetime(2024, 1, 1),
}

dag = DAG('example_set_downstream', default_args=default_args, schedule_interval='@once')

# Define tasks
task1 = BashOperator(
    task_id='task1',
    bash_command='echo "Task 1 running"',
    dag=dag
)

task2 = BashOperator(
    task_id='task2',
    bash_command='echo "Task 2 running"',
    dag=dag
)

# Set task2 to run after task1
task1.set_downstream(task2)
Output
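When the DAG runs, each task writes its echo output to its own task log. In execution order, the logs show:
Task 1 running
Task 2 running
With the default settings, task2 starts only after task1 has completed successfully.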
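To see why only one order is possible, it can help to treat the dependency as a small graph. The sketch below does not use Airflow at all. It relies on Python's standard graphlib module, and the downstream mapping is a hypothetical stand-in for the two-task DAG in the example.

```python
from graphlib import TopologicalSorter

# Hypothetical stand-in for the example DAG:
# each key maps to the tasks that were set downstream of it.
downstream = {"task1": ["task2"], "task2": []}

# TopologicalSorter expects predecessors, so invert the mapping.
upstream = {task: set() for task in downstream}
for task, followers in downstream.items():
    for follower in followers:
        upstream[follower].add(task)

# A task can only appear after everything upstream of it.
run_order = list(TopologicalSorter(upstream).static_order())
print(run_order)  # ['task1', 'task2']
```

This is only a model of the ordering constraint. Airflow's scheduler works with task states and trigger rules rather than a static list.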
Common Pitfalls
Common mistakes when using set_downstream include:
- Calling set_downstream on the wrong task, reversing the order.
- Not passing a task or list of tasks, causing errors.
- Mixing set_downstream with other dependency methods inconsistently.
Always ensure the task you call set_downstream on is the one that should run first.
python
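# task1 and task2 are the BashOperator tasks defined in the example above.
# Use only one of the two calls below; declaring both would create a cycle.

# Wrong order: task2 would run before task1, reversing the intended order
task2.set_downstream(task1)

# Correct order: task1 runs first, then task2
task1.set_downstream(task2)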
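The effect of a reversed call, and of accidentally declaring both directions, can be checked on a plain graph. The following is a minimal sketch that uses only the standard graphlib module. The run_order helper and the edge tuples are hypothetical names for illustration and are not part of Airflow.

```python
from graphlib import CycleError, TopologicalSorter


def run_order(edges):
    """Return a valid run order for (first, then) pairs, or None if they form a cycle."""
    sorter = TopologicalSorter()
    for first, then in edges:
        sorter.add(then, first)  # `then` depends on `first`
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# Equivalent of task1.set_downstream(task2)
intended = run_order([("task1", "task2")])
# Equivalent of task2.set_downstream(task1): the order is flipped
flipped = run_order([("task2", "task1")])
# Both calls together: no valid order exists
both = run_order([("task1", "task2"), ("task2", "task1")])

print(intended)  # ['task1', 'task2']
print(flipped)   # ['task2', 'task1']
print(both)      # None
```

A reversed call produces a valid but wrong order, so it will not raise an error on its own. Mixing both directions yields a cycle, which Airflow is expected to reject when it loads the DAG.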
Quick Reference
| Method | Description | Example |
|---|---|---|
| set_downstream | Sets tasks to run after the current task | task1.set_downstream(task2) |
| set_upstream | Sets tasks to run before the current task | task2.set_upstream(task1) |
| Bitshift operator >> | Shorthand for a downstream dependency | task1 >> task2 |
| Bitshift operator << | Shorthand for an upstream dependency | task2 << task1 |
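All four forms in the table record the same edge between task1 and task2. The sketch below makes that concrete with a toy model. ToyTask is a hypothetical class written for this illustration, not Airflow's operator class, and it tracks nothing but upstream and downstream IDs.

```python
class ToyTask:
    """Hypothetical stand-in for an Airflow task; tracks dependency IDs only."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream_ids = set()
        self.downstream_ids = set()

    def set_downstream(self, tasks):
        # Accept a single task or a list of tasks.
        for other in tasks if isinstance(tasks, list) else [tasks]:
            self.downstream_ids.add(other.task_id)
            other.upstream_ids.add(self.task_id)

    def set_upstream(self, tasks):
        # An upstream edge is the same edge seen from the other side.
        for other in tasks if isinstance(tasks, list) else [tasks]:
            other.set_downstream(self)

    def __rshift__(self, other):  # self >> other
        self.set_downstream(other)
        return other

    def __lshift__(self, other):  # self << other
        self.set_upstream(other)
        return other


def edges(build):
    """Create two fresh tasks, apply one dependency form, return the recorded edge."""
    task1, task2 = ToyTask("task1"), ToyTask("task2")
    build(task1, task2)
    return task1.downstream_ids, task2.upstream_ids


forms = [
    lambda t1, t2: t1.set_downstream(t2),
    lambda t1, t2: t2.set_upstream(t1),
    lambda t1, t2: t1 >> t2,
    lambda t1, t2: t2 << t1,
]
results = [edges(form) for form in forms]
print(all(r == ({"task2"}, {"task1"}) for r in results))  # True
```

Because the forms are interchangeable, picking one style per DAG file keeps the dependency section easy to scan.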
Key Takeaways
Use set_downstream to specify tasks that run after the current task in Airflow.
Call set_downstream on the task that should run first, passing the task or tasks that should follow it.
You can pass a single task or a list of tasks to set_downstream.
Avoid reversing task order by calling set_downstream on the wrong task.
Consider using the newer >> operator as a clearer alternative to set_downstream.