How to Use TaskGroup in Airflow for Better DAG Organization
Use TaskGroup in Airflow to group related tasks inside a DAG for better organization and visualization. Wrap the tasks in a with TaskGroup('group_id') as group: block inside the DAG definition so that the group becomes part of that DAG.
Syntax
A TaskGroup is declared inside a DAG to group tasks. Create it with TaskGroup(group_id, tooltip='') and define the tasks that belong to it inside the with block. Airflow renders those tasks as a single cluster in the UI and, by default, prefixes each task ID with the group ID (for example, group1.task_a).
- group_id: Unique identifier for the task group.
- tooltip: Optional description shown on hover in UI.
```python
from airflow import DAG
from airflow.utils.task_group import TaskGroup
from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime

default_args = {'start_date': datetime(2024, 1, 1)}

dag = DAG('example_taskgroup', default_args=default_args, schedule_interval='@daily')

with dag:
    start = DummyOperator(task_id='start')

    with TaskGroup('group1', tooltip='Tasks in group 1') as group1:
        task_a = DummyOperator(task_id='task_a')
        task_b = DummyOperator(task_id='task_b')

    end = DummyOperator(task_id='end')

    start >> group1 >> end
```
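To make the role of the with block more concrete, here is a minimal sketch of how a context manager can collect everything created inside it. This is a toy model, not Airflow's implementation: the Group class and the make_task helper are hypothetical names invented for illustration, and only the default ID prefixing behavior is borrowed from Airflow.

```python
class Group:
    """Toy stand-in for a task group (hypothetical, not Airflow's TaskGroup)."""

    _stack = []  # groups whose `with` block is currently open

    def __init__(self, group_id):
        self.group_id = group_id
        self.task_ids = []

    def __enter__(self):
        # Entering the block makes this group the "current" one.
        Group._stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the block restores whatever was current before.
        Group._stack.pop()
        return False

    @classmethod
    def current(cls):
        return cls._stack[-1] if cls._stack else None


def make_task(task_id):
    """Hypothetical task factory: registers the task with the open group, if any."""
    group = Group.current()
    if group is None:
        return task_id
    full_id = f"{group.group_id}.{task_id}"  # mimics the default ID prefix
    group.task_ids.append(full_id)
    return full_id


start = make_task("start")
with Group("group1") as group1:
    task_a = make_task("task_a")
    task_b = make_task("task_b")
end = make_task("end")

print(group1.task_ids)  # ['group1.task_a', 'group1.task_b']
print(start, end)       # start end
```

Only the tasks created while the block is open end up in the group, which is why a task defined before or after the block stays ungrouped.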
Example
This example creates a DAG with a TaskGroup named 'group1' containing two dummy tasks. Both tasks run after the start task and before the end task, demonstrating task grouping and dependencies. Because task_a and task_b do not depend on each other, Airflow may run them in parallel.
```python
from airflow import DAG
from airflow.utils.task_group import TaskGroup
from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime

default_args = {'start_date': datetime(2024, 1, 1)}

dag = DAG('taskgroup_example', default_args=default_args, schedule_interval='@daily')

with dag:
    start = DummyOperator(task_id='start')

    # Every task created inside this block belongs to group1.
    with TaskGroup('group1') as group1:
        task_a = DummyOperator(task_id='task_a')
        task_b = DummyOperator(task_id='task_b')

    end = DummyOperator(task_id='end')

    # start runs first, then both tasks in group1, then end.
    start >> group1 >> end
```
Output
INFO - Starting DAG run for taskgroup_example
INFO - Executing task start
INFO - Executing task task_a
INFO - Executing task task_b
INFO - Executing task end
INFO - DAG run completed successfully
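The ordering implied by start >> group1 >> end can be modeled without Airflow. The sketch below uses Python's standard graphlib module (Python 3.9+) and assumes the default behavior where task IDs inside a group are prefixed with the group ID. The execution_waves helper is a hypothetical name. It is an illustration of the dependency structure, not of how the Airflow scheduler works internally.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on.
# start >> group1 >> end expands to: start before every task in the group,
# and every task in the group before end.
deps = {
    "start": set(),
    "group1.task_a": {"start"},
    "group1.task_b": {"start"},
    "end": {"group1.task_a", "group1.task_b"},
}


def execution_waves(dependencies):
    """Return lists of tasks that become runnable at the same time."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(deps)
print(waves)
# [['start'], ['group1.task_a', 'group1.task_b'], ['end']]
```

Both grouped tasks land in the same wave, so the relative order of task_a and task_b in a real run is not guaranteed unless a dependency such as task_a >> task_b is added inside the group.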
Common Pitfalls
Common mistakes when using TaskGroup include:
- Not using the with block, which breaks grouping.
- Using duplicate group_id values in the same DAG.
- Incorrect task dependencies outside the group, causing unexpected execution order.
Always ensure tasks inside a TaskGroup are defined within the with TaskGroup() block and that group_id is unique.
```python
from airflow import DAG
from airflow.utils.task_group import TaskGroup
from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime

default_args = {'start_date': datetime(2024, 1, 1)}

# Wrong: tasks defined outside the TaskGroup context. Each task has to be
# attached by hand with task_group=...; a task that omits it is not grouped.
dag = DAG('wrong_taskgroup', default_args=default_args, schedule_interval='@daily')

with dag:
    start = DummyOperator(task_id='start')

    group1 = TaskGroup('group1')
    task_a = DummyOperator(task_id='task_a', task_group=group1)
    task_b = DummyOperator(task_id='task_b', task_group=group1)

    end = DummyOperator(task_id='end')

    start >> group1 >> end

# Right way: define the tasks inside the with TaskGroup(...) block.
# A separate DAG (right_taskgroup) is used so no task ID is declared twice.
right_dag = DAG('right_taskgroup', default_args=default_args, schedule_interval='@daily')

with right_dag:
    start = DummyOperator(task_id='start')

    with TaskGroup('group1') as group1:
        task_a = DummyOperator(task_id='task_a')
        task_b = DummyOperator(task_id='task_b')

    end = DummyOperator(task_id='end')

    start >> group1 >> end
```
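For the duplicate group_id pitfall, a quick check over the planned group names can catch the problem before the DAG is parsed. This is a minimal sketch using only the standard library. The find_duplicate_group_ids helper and the sample names are hypothetical and not part of Airflow.

```python
from collections import Counter


def find_duplicate_group_ids(group_ids):
    """Return the group IDs that appear more than once, sorted alphabetically."""
    counts = Counter(group_ids)
    return sorted(gid for gid, n in counts.items() if n > 1)


# Hypothetical list of group IDs planned for a single DAG.
planned = ["extract", "transform", "load", "transform"]

duplicates = find_duplicate_group_ids(planned)
print(duplicates)  # ['transform']
```

A check like this is mostly useful when group IDs are generated in a loop or from configuration, where a repeated name is easy to miss.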
Quick Reference
| Feature | Description |
|---|---|
| TaskGroup(group_id, tooltip='') | Create a group of tasks with an optional tooltip. |
| Use with TaskGroup(...) as group: | Define tasks inside this block to belong to the group. |
| Unique group_id | Each TaskGroup must have a unique ID within the DAG. |
| Task dependencies | Set dependencies between groups or tasks as usual. |
| Visual grouping | Groups appear as collapsible boxes in Airflow UI for clarity. |
Key Takeaways
Use TaskGroup with a unique group_id inside a DAG to organize related tasks visually.
Define tasks inside the with TaskGroup(...) block to include them in the group.
Set dependencies between TaskGroups and tasks normally to control execution order.
Avoid defining tasks outside the TaskGroup context to prevent grouping errors.
TaskGroups improve DAG readability and make complex workflows easier to manage.