How to Use Multiple Outputs in Airflow's TaskFlow API
In Airflow's TaskFlow API, you can use the @task decorator with the multiple_outputs=True parameter to return a dictionary of outputs from a task. This allows downstream tasks to access each output by key, enabling clear and flexible data passing between tasks.
Syntax
The @task decorator in Airflow can be used with the multiple_outputs=True argument to indicate that the task returns a dictionary. Each key-value pair in the dictionary becomes a separate output accessible by downstream tasks.
Example parts:
- @task(multiple_outputs=True): Marks the function as a task that returns multiple outputs.
- Function returns a dictionary: keys are output names, values are output data.
- Downstream tasks receive outputs by referencing keys.
```python
from airflow.decorators import task


@task(multiple_outputs=True)
def generate_outputs():
    # Each key in the returned dictionary becomes a separate output.
    return {'output1': 'data1', 'output2': 42}
```
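To make the "separate output" idea concrete, here is a simplified pure-Python model of what happens to the returned dictionary. It is a sketch, not Airflow's actual implementation: a plain dictionary stands in for the XCom store, and push_result and xcom_store are hypothetical names chosen for illustration. It assumes the behaviour that, with multiple_outputs=True, each key is stored as its own entry while the full dictionary is still stored under the default return_value key.

```python
from typing import Any

# Hypothetical stand-in for Airflow's XCom store: maps (task_id, key) -> value.
XComStore = dict[tuple[str, str], Any]


def push_result(store: XComStore, task_id: str, result: Any, multiple_outputs: bool) -> None:
    """Model how a task's return value is stored (simplified, not Airflow code)."""
    if multiple_outputs:
        # Each dictionary entry becomes its own addressable output.
        for key, value in result.items():
            store[(task_id, key)] = value
    # The full return value is always stored under the default key.
    store[(task_id, "return_value")] = result


xcom_store: XComStore = {}
push_result(
    xcom_store,
    "generate_outputs",
    {"output1": "data1", "output2": 42},
    multiple_outputs=True,
)

# prints ['output1', 'output2', 'return_value']
print(sorted(key for _, key in xcom_store))
```

With multiple_outputs=False the same call would store only the return_value entry, which is why individual keys are not addressable in that case.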
Example
This example shows a task returning multiple outputs as a dictionary, and a downstream task accessing these outputs by their keys.
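```python
from datetime import datetime

from airflow import DAG
from airflow.decorators import task

# schedule_interval=None means the DAG runs only when triggered manually.
# (Airflow 2.4 and later name this parameter `schedule`.)
with DAG(
    dag_id='multiple_outputs_example',
    start_date=datetime(2024, 1, 1),  # fixed date; days_ago() is deprecated
    schedule_interval=None,
) as dag:

    @task(multiple_outputs=True)
    def generate_outputs():
        # Each key becomes a separately addressable output.
        return {'output1': 'hello', 'output2': 123}

    @task
    def consume_outputs(output1, output2):
        print(f"Output1: {output1}")
        print(f"Output2: {output2}")

    outputs = generate_outputs()
    # Indexing by key passes a single output to the downstream task.
    consume_outputs(outputs['output1'], outputs['output2'])
```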
Output
Output1: hello
Output2: 123
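In the DAG above, outputs['output1'] is evaluated while the DAG file is parsed, before any task has run, so it cannot be an ordinary dictionary lookup. Airflow instead hands back a lazy reference that is resolved to the stored value when the downstream task executes. The sketch below models that idea in plain Python; OutputRef and store are hypothetical names, and the storage layout is an assumption made for illustration rather than Airflow's real data model.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class OutputRef:
    """Hypothetical, simplified stand-in for the object a TaskFlow call returns."""

    task_id: str
    key: str = "return_value"

    def __getitem__(self, key: str) -> "OutputRef":
        # Indexing does not look anything up yet; it only records which key to fetch.
        return OutputRef(self.task_id, key)

    def resolve(self, store: dict[tuple[str, str], Any]) -> Any:
        # At run time the reference is swapped for the stored value.
        return store[(self.task_id, self.key)]


# Values as they would be stored after generate_outputs has run (assumed layout).
store = {
    ("generate_outputs", "output1"): "hello",
    ("generate_outputs", "output2"): 123,
    ("generate_outputs", "return_value"): {"output1": "hello", "output2": 123},
}

outputs = OutputRef("generate_outputs")
print(outputs["output1"].resolve(store))  # prints hello
print(outputs["output2"].resolve(store))  # prints 123
```

Resolving the reference without a key returns the whole dictionary in this model, which matches the pattern of passing the entire result to a downstream task.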
Common Pitfalls
Common mistakes when using multiple outputs in TaskFlow include:
- Not setting multiple_outputs=True in the @task decorator, which causes Airflow to treat the output as a single value instead of a dictionary of separate outputs.
- Returning a non-dictionary type when multiple_outputs=True is set, which makes the task fail.
- Accessing outputs without using keys, so the downstream task receives the whole dictionary rather than a single value, which can cause runtime errors.
```python
from airflow.decorators import task


# Wrong: missing multiple_outputs=True
@task
def wrong_task():
    return {'key': 'value'}  # This will be treated as one output, not multiple


# Right:
@task(multiple_outputs=True)
def right_task():
    return {'key': 'value'}
```
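The return-type pitfall above can be caught before a DAG is deployed by checking the value a task function would return. The following is a minimal sketch of such a check, not Airflow's own validation code: check_multiple_outputs is a hypothetical helper, TypeError is a stand-in for whatever exception Airflow raises, and the requirement that keys be strings reflects my understanding of Airflow's behaviour rather than anything stated above.

```python
from collections.abc import Mapping
from typing import Any


def check_multiple_outputs(result: Any) -> dict[str, Any]:
    """Validate a value intended for a multiple_outputs=True task (simplified model)."""
    if not isinstance(result, Mapping):
        # Stand-in for the failure Airflow reports; the exception type is an assumption.
        raise TypeError(
            f"expected a dictionary for multiple_outputs, got {type(result).__name__}"
        )
    for key in result:
        if not isinstance(key, str):
            # Assumption: output names must be strings.
            raise TypeError(f"output keys must be strings, got {key!r}")
    return dict(result)


print(check_multiple_outputs({"key": "value"}))  # prints {'key': 'value'}

try:
    check_multiple_outputs(["not", "a", "dict"])
except TypeError as err:
    # prints rejected: expected a dictionary for multiple_outputs, got list
    print(f"rejected: {err}")
```

A check like this could run in a unit test against the plain Python function behind each task, so a wrong return type surfaces before the task fails in a scheduled run.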
Quick Reference
| Feature | Description |
|---|---|
| @task(multiple_outputs=True) | Decorator to enable multiple outputs from a task |
| Return type | Must be a dictionary with keys as output names |
| Access outputs | Use keys like outputs['key'] in downstream tasks |
| Use case | Passing multiple data points cleanly between tasks |
Key Takeaways
Use @task(multiple_outputs=True) to return multiple outputs as a dictionary.
Always return a dictionary when multiple_outputs=True is set.
Access each output by its key in downstream tasks.
Not setting multiple_outputs=True causes Airflow to treat output as a single value.
This pattern improves clarity and flexibility in task data passing.