
How to Use TaskGroup in Airflow for Better DAG Organization

Use TaskGroup in Airflow to group related tasks inside a DAG for better organization and visualization. Define the tasks inside a with TaskGroup('group_id') as group: block within the DAG context; the group then belongs to that DAG and can be used in dependencies like a single task.

Syntax

TaskGroup is used inside a DAG to group tasks. You create it with TaskGroup(group_id, tooltip=""). Tasks defined inside the with block belong to the group, which the Airflow UI draws as a single collapsible cluster.

  • group_id: Unique identifier for the task group. By default it is also prefixed to the IDs of the tasks inside the group (for example group1.task_a).
  • tooltip: Optional description shown on hover in the UI.
python
from airflow import DAG
from airflow.utils.task_group import TaskGroup
# EmptyOperator (Airflow 2.3+) replaces the deprecated DummyOperator
from airflow.operators.empty import EmptyOperator
from datetime import datetime

default_args = {'start_date': datetime(2024, 1, 1)}

dag = DAG('example_taskgroup', default_args=default_args, schedule_interval='@daily')

with dag:
    start = EmptyOperator(task_id='start')

    # Every task created inside this block is added to group1
    with TaskGroup('group1', tooltip='Tasks in group 1') as group1:
        task_a = EmptyOperator(task_id='task_a')
        task_b = EmptyOperator(task_id='task_b')

    end = EmptyOperator(task_id='end')

    # The whole group is used in the dependency chain like a single task
    start >> group1 >> end
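
Why the with block matters can be illustrated with a simplified model in plain Python. This is only a sketch of the general idea, not Airflow's actual implementation: a context manager keeps track of the group whose block is currently open, and each new task attaches itself to that group. The names MiniGroup and MiniTask are hypothetical stand-ins invented for this illustration.

```python
class MiniGroup:
    """Hypothetical stand-in for a task group; not Airflow's TaskGroup."""

    _active = []  # stack of groups whose `with` block is currently open

    def __init__(self, group_id):
        self.group_id = group_id
        self.children = []

    def __enter__(self):
        MiniGroup._active.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        MiniGroup._active.pop()
        return False  # do not swallow exceptions


class MiniTask:
    """Hypothetical stand-in for an operator."""

    def __init__(self, task_id):
        group = MiniGroup._active[-1] if MiniGroup._active else None
        # Mirror Airflow's default naming, where the group id prefixes the task id
        self.task_id = f"{group.group_id}.{task_id}" if group else task_id
        if group:
            group.children.append(self)


with MiniGroup("group1") as group1:
    task_a = MiniTask("task_a")
    task_b = MiniTask("task_b")

outside = MiniTask("end")  # created after the block closed, so not grouped

print([t.task_id for t in group1.children])  # ['group1.task_a', 'group1.task_b']
print(outside.task_id)  # end
```

A task created after the block has closed has no open group to attach to, which is why it ends up outside the group in this model.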

Example

This example creates a DAG with a TaskGroup named 'group1' that contains two placeholder tasks. Both tasks run after the start task and before the end task. Because no dependency is set between task_a and task_b, they may run in parallel.

python
from airflow import DAG
from airflow.utils.task_group import TaskGroup
# EmptyOperator (Airflow 2.3+) replaces the deprecated DummyOperator
from airflow.operators.empty import EmptyOperator
from datetime import datetime

default_args = {'start_date': datetime(2024, 1, 1)}

dag = DAG('taskgroup_example', default_args=default_args, schedule_interval='@daily')

with dag:
    start = EmptyOperator(task_id='start')

    # Tasks created here get the IDs group1.task_a and group1.task_b
    with TaskGroup('group1') as group1:
        task_a = EmptyOperator(task_id='task_a')
        task_b = EmptyOperator(task_id='task_b')

    end = EmptyOperator(task_id='end')

    # start runs first, then both grouped tasks, then end
    start >> group1 >> end
Output (simplified)
INFO - Starting DAG run for taskgroup_example
INFO - Executing task start
INFO - Executing task group1.task_a
INFO - Executing task group1.task_b
INFO - Executing task end
INFO - DAG run completed successfully
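
The line start >> group1 >> end can be read as follows: start runs before every task in the group that has no upstream task inside the group, and end runs after every task in the group that has no downstream task inside the group. The plain-Python sketch below models that expansion. The helper expand_group_edges is hypothetical and written only for this illustration; it is not an Airflow function.

```python
def expand_group_edges(upstream, members, inner_edges, downstream):
    """Return the task-level edges implied by `upstream >> group >> downstream`.

    members     -- task ids inside the group
    inner_edges -- (from, to) pairs between members of the group
    """
    has_upstream = {dst for _, dst in inner_edges}
    has_downstream = {src for src, _ in inner_edges}
    # Roots have no upstream inside the group; leaves have no downstream inside it
    roots = [m for m in members if m not in has_upstream]
    leaves = [m for m in members if m not in has_downstream]
    edges = [(upstream, root) for root in roots]
    edges += list(inner_edges)
    edges += [(leaf, downstream) for leaf in leaves]
    return edges


members = ["group1.task_a", "group1.task_b"]

# No edges inside the group, as in the example: both tasks depend only on start
parallel = expand_group_edges("start", members, [], "end")
print(parallel)

# With task_a >> task_b inside the group, only task_a follows start
# and only task_b precedes end
chained = expand_group_edges(
    "start", members, [("group1.task_a", "group1.task_b")], "end"
)
print(chained)
```

In the example DAG there are no dependencies inside the group, so it corresponds to the first case: both grouped tasks wait only for start, and end waits for both.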

Common Pitfalls

Common mistakes when using TaskGroup include:

  • Creating the TaskGroup without a with block, so a task joins the group only if task_group is passed to it explicitly.
  • Reusing the same group_id for more than one group in the same DAG.
  • Setting dependencies on individual tasks around the group incorrectly, which causes an unexpected execution order.

Always define the tasks of a TaskGroup inside its with TaskGroup() block, and keep every group_id unique within the DAG.

python
from airflow import DAG
from airflow.utils.task_group import TaskGroup
# EmptyOperator (Airflow 2.3+) replaces the deprecated DummyOperator
from airflow.operators.empty import EmptyOperator
from datetime import datetime

default_args = {'start_date': datetime(2024, 1, 1)}

dag = DAG('wrong_taskgroup', default_args=default_args, schedule_interval='@daily')

with dag:
    start = EmptyOperator(task_id='start')

    # Fragile: the group is created without a with block, so each task must
    # pass task_group explicitly; a task that omits it is left outside the group
    group1 = TaskGroup('group1')
    task_a = EmptyOperator(task_id='task_a', task_group=group1)
    task_b = EmptyOperator(task_id='task_b', task_group=group1)

    end = EmptyOperator(task_id='end')

    start >> group1 >> end

# Right way (use this instead of the block above, not in addition to it,
# otherwise the same task IDs would be defined twice in one DAG):
with dag:
    start = EmptyOperator(task_id='start')

    with TaskGroup('group1') as group1:
        task_a = EmptyOperator(task_id='task_a')
        task_b = EmptyOperator(task_id='task_b')

    end = EmptyOperator(task_id='end')

    start >> group1 >> end
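
Duplicate group_id values are easy to introduce when group IDs are built dynamically, for example in a loop. A small pre-check in plain Python can catch them before the groups are created. The helper find_duplicate_group_ids and the sample IDs below are hypothetical and not part of Airflow; this is a minimal sketch of one way to do the check.

```python
from collections import Counter


def find_duplicate_group_ids(group_ids):
    """Return the group ids that occur more than once, in sorted order."""
    counts = Counter(group_ids)
    return sorted(gid for gid, n in counts.items() if n > 1)


# Hypothetical list of group ids planned for one DAG
planned = ["extract", "transform", "load", "extract"]

duplicates = find_duplicate_group_ids(planned)
print(duplicates)  # ['extract']
```

If the returned list is not empty, rename the affected groups before defining them in the DAG.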

Quick Reference

Feature | Description
TaskGroup(group_id, tooltip="") | Create a group of tasks with an optional tooltip.
with TaskGroup(...) as group: | Define tasks inside this block to belong to the group.
Unique group_id | Each TaskGroup must have a unique ID within the DAG.
Task dependencies | Set dependencies between groups or tasks as usual.
Visual grouping | Groups appear as collapsible boxes in the Airflow UI for clarity.

Key Takeaways

  • Use TaskGroup with a unique group_id inside a DAG to organize related tasks visually.
  • Define tasks inside the with TaskGroup(...) block to include them in the group.
  • Set dependencies between TaskGroups and tasks normally to control execution order.
  • Avoid defining tasks outside the TaskGroup context to prevent grouping errors.
  • TaskGroups improve DAG readability and make complex workflows easier to manage.