How to Use chain() in Apache Airflow for Task Dependencies
In Apache Airflow, use the chain() function to set a sequence of task dependencies in a simple way by passing tasks in the order they should run. This replaces multiple set_downstream() or set_upstream() calls, making your DAG code cleaner and easier to read.
Syntax
The chain() function takes multiple tasks or lists of tasks as arguments and links them in the order provided. Each task will be set as downstream of the previous one.
Basic syntax:
chain(task1, task2, task3, ...)
Where:
- task1, task2, task3 are Airflow tasks or lists of tasks.
- Tasks are linked so task1 runs before task2, which runs before task3, and so on.
```python
from airflow.models.baseoperator import chain

chain(task1, task2, task3)
```
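To make the linking rule concrete, here is a minimal pure-Python sketch that models it without importing Airflow. The Task class and link_in_order function are hypothetical stand-ins invented for this illustration, not Airflow's implementation; they only record which task is downstream of which.

```python
# Simplified model of linear chaining. Task and link_in_order are
# hypothetical stand-ins for illustration, not part of Airflow.
class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream_ids = set()

    def set_downstream(self, other):
        # Record that `other` must run after this task.
        self.downstream_ids.add(other.task_id)


def link_in_order(*tasks):
    # Link each task to the one that follows it in the argument list.
    for upstream, downstream in zip(tasks, tasks[1:]):
        upstream.set_downstream(downstream)


# One call to link_in_order ...
a1, a2, a3 = Task("task1"), Task("task2"), Task("task3")
link_in_order(a1, a2, a3)

# ... records the same edges as two explicit set_downstream calls.
b1, b2, b3 = Task("task1"), Task("task2"), Task("task3")
b1.set_downstream(b2)
b2.set_downstream(b3)

same_edges = [t.downstream_ids for t in (a1, a2, a3)] == [
    t.downstream_ids for t in (b1, b2, b3)
]
print(same_edges)
```

The saving grows with the length of the sequence: n tasks need n - 1 explicit set_downstream calls but only one chained call.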
Example
This example shows how to use chain() to link three tasks in a DAG so they run one after another.
```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago
from airflow.models.baseoperator import chain

default_args = {
    'start_date': days_ago(1),
}

dag = DAG('chain_example_dag', default_args=default_args, schedule_interval='@daily')

# Define tasks
start = BashOperator(task_id='start', bash_command='echo Start', dag=dag)
process = BashOperator(task_id='process', bash_command='echo Processing', dag=dag)
end = BashOperator(task_id='end', bash_command='echo End', dag=dag)

# Use chain to set dependencies
chain(start, process, end)
```
Output
When the DAG runs, tasks execute in order: start → process → end.
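The execution order follows directly from the dependency edges. As a rough illustration in plain Python (using the standard-library graphlib module, not Airflow's scheduler), a topological sort of the edges that chain(start, process, end) creates yields the same order. The predecessors mapping below is written by hand for this sketch.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of tasks that must finish before it can run.
# These edges mirror chain(start, process, end) and are written by hand.
predecessors = {
    "start": set(),
    "process": {"start"},
    "end": {"process"},
}

order = list(TopologicalSorter(predecessors).static_order())
print(" -> ".join(order))  # start -> process -> end
```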
Common Pitfalls
Common mistakes when using chain() include:
- Passing tasks in the wrong order, which causes incorrect execution sequence.
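- Mixing single tasks and lists incorrectly, leading to unexpected dependencies. A single task next to a list is linked to every task in that list, while two adjacent lists are linked pairwise and must have the same length.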
- Not importing chain from airflow.models.baseoperator.
Example of a wrong usage and the correct fix:
```python
# Wrong: tasks order reversed
chain(end, process, start)  # This runs end before process and start, which is wrong

# Right: correct order
chain(start, process, end)
```
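The list pitfall is easier to see in a small model. The sketch below is plain Python and does not import Airflow; Task and chain_sketch are hypothetical names made up for this illustration. It assumes the rule that a single task beside a list links to every task in it, while two adjacent lists are linked pairwise and must be the same length. Airflow raises its own exception type for mismatched lengths; ValueError is used here only as a placeholder.

```python
# Simplified model of how chain() treats lists. Task and chain_sketch are
# hypothetical stand-ins for illustration, not Airflow's implementation.
class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream_ids = set()


def chain_sketch(*items):
    for up, down in zip(items, items[1:]):
        ups = up if isinstance(up, list) else [up]
        downs = down if isinstance(down, list) else [down]
        if isinstance(up, list) and isinstance(down, list):
            # Two adjacent lists are linked pairwise and must match in length.
            if len(ups) != len(downs):
                raise ValueError("adjacent lists must have the same length")
            pairs = list(zip(ups, downs))
        else:
            # A single task next to a list links to every task in that list.
            pairs = [(u, d) for u in ups for d in downs]
        for u, d in pairs:
            u.downstream_ids.add(d.task_id)


t1, t2, t3, t4, t5, t6 = (Task(f"t{i}") for i in range(1, 7))
chain_sketch(t1, [t2, t3], [t4, t5], t6)

print(sorted(t1.downstream_ids))  # ['t2', 't3']  (fan-out)
print(sorted(t2.downstream_ids))  # ['t4']        (pairwise)
print(sorted(t4.downstream_ids))  # ['t6']        (fan-in)
```

If every task in one list should precede every task in the next, pairwise linking is not what you want; check the resulting graph in the Airflow UI before relying on it.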
Quick Reference
| Function | Description | Example |
|---|---|---|
| chain | Links tasks in sequence | chain(task1, task2, task3) |
| set_downstream | Sets one task downstream of another | task1.set_downstream(task2) |
| set_upstream | Sets one task upstream of another | task2.set_upstream(task1) |
Key Takeaways
Use chain() to link multiple tasks in a clear, linear order.
Pass tasks to chain() in the exact order you want them to run.
Import chain from airflow.models.baseoperator before using it.
Avoid mixing lists and single tasks incorrectly in chain arguments.
chain() simplifies DAG code by replacing multiple set_downstream calls.