Task documentation and tags in Apache Airflow - Commands & Configuration

Introduction
When you build workflows in Airflow, you want to explain what each task does and keep your DAGs easy to find. Task documentation (the doc_md parameter) helps others understand your workflow. Tags are set on the DAG, not on individual tasks, and let you filter the DAG list in the Airflow UI.
When you want to explain the purpose of a task clearly for your team.
When you have many DAGs and want to filter or search them by category.
When you want to add extra notes or instructions to a task for future reference.
When you want to organize DAGs by type, like 'data-processing' or 'notification'.
When you want to improve the readability and maintainability of your DAGs.
Config File - my_dag.py
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime

def greet():
    print('Hello from the task!')

def farewell():
    print('Goodbye from the task!')

with DAG(
    dag_id='example_task_docs_tags',
    start_date=datetime(2024, 1, 1),
    schedule_interval='@daily',
    catchup=False,
    # Tags are a DAG-level setting; operators do not accept a tags argument.
    tags=['example', 'greeting', 'farewell'],
) as dag:

    greet_task = PythonOperator(
        task_id='greet',
        python_callable=greet,
        # doc_md holds Markdown that Airflow shows with the task in the UI.
        doc_md='''
        ### Greet Task
        This task prints a greeting message to the logs.
        It helps verify the DAG is running correctly.
        ''',
    )

    farewell_task = PythonOperator(
        task_id='farewell',
        python_callable=farewell,
        doc_md='''
        ### Farewell Task
        This task prints a farewell message to the logs.
        It runs after the greet task.
        ''',
    )

    greet_task >> farewell_task

This DAG file defines two tasks using PythonOperator.

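doc_md adds Markdown documentation to each task; Airflow renders it with the task's details in the UI.

tags is a DAG-level parameter. It labels the whole DAG so you can filter the DAG list in the UI. Operators do not accept a tags argument, and Airflow 2 raises an error if you pass one to PythonOperator.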

The tasks are connected so that farewell_task runs after greet_task.
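The doc_md strings in my_dag.py are indented to match the surrounding code, so every line carries leading spaces. Markdown treats lines indented by four or more spaces as a code block, and whether the UI strips that indentation before rendering may depend on the Airflow version. The sketch below uses only the standard library to remove the indentation yourself; the RAW_DOC name is a placeholder for illustration.

```python
import inspect

# The same text as the greet task's doc_md, indented as it appears in the DAG file.
RAW_DOC = '''
        ### Greet Task
        This task prints a greeting message to the logs.
        It helps verify the DAG is running correctly.
        '''

# cleandoc removes the common leading indentation and the blank
# first and last lines, leaving plain Markdown.
clean_doc = inspect.cleandoc(RAW_DOC)
print(clean_doc)
```

You could then pass the cleaned string as doc_md=clean_doc when constructing the operator.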

Commands
List all DAGs available in Airflow to confirm your DAG is recognized.
Terminal
airflow dags list
Expected Output (abridged; the full output is a table with further columns such as the file path and owner)
example_task_docs_tags
List all tasks in the example_task_docs_tags DAG to see their IDs and confirm they are loaded. Task IDs are printed one per line in alphabetical order.
Terminal
airflow tasks list example_task_docs_tags
Expected Output
farewell
greet
Run the greet task manually for the date 2024-01-01 to see its output and verify it works.
Terminal
airflow tasks test example_task_docs_tags greet 2024-01-01
Expected Output (illustrative; the exact log format and source line references vary by Airflow version)
[2024-01-01 00:00:00,000] {python.py:114} INFO - Hello from the task!
[2024-01-01 00:00:00,001] {taskinstance.py:1234} INFO - Task exited with return code 0
Run the farewell task manually for the date 2024-01-01 to see its output and verify it works.
Terminal
airflow tasks test example_task_docs_tags farewell 2024-01-01
Expected Output (illustrative; the exact log format and source line references vary by Airflow version)
[2024-01-01 00:00:00,000] {python.py:114} INFO - Goodbye from the task!
[2024-01-01 00:00:00,001] {taskinstance.py:1234} INFO - Task exited with return code 0
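If you run these checks often, you can collect them in a small script. The following is a minimal sketch, not part of Airflow: the helper names verification_commands and run_all are hypothetical, and the script only assembles the same four commands shown above. It runs them only if an airflow executable is found on PATH.

```python
import shutil
import subprocess


def verification_commands(dag_id, task_ids, logical_date):
    """Return the CLI calls from this section as argument lists."""
    commands = [
        ["airflow", "dags", "list"],
        ["airflow", "tasks", "list", dag_id],
    ]
    for task_id in task_ids:
        commands.append(
            ["airflow", "tasks", "test", dag_id, task_id, logical_date]
        )
    return commands


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    if shutil.which("airflow") is None:
        print("airflow CLI not found on PATH; nothing was run")
        return False
    for argv in commands:
        subprocess.run(argv, check=True)
    return True


commands = verification_commands(
    "example_task_docs_tags", ["greet", "farewell"], "2024-01-01"
)
for argv in commands:
    print(" ".join(argv))
```

Calling run_all(commands) in an environment where Airflow is installed would execute the checks in the order listed.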
Key Concept

If you remember nothing else, remember: doc_md explains what a task does, and DAG-level tags help you find and organize DAGs in the Airflow UI.

Common Mistakes
Not adding doc_md to tasks or leaving it empty
Without documentation, others cannot understand the task's purpose, making maintenance harder.
Always add clear doc_md strings to explain what each task does.
Using inconsistent or no tags on DAGs
Without tags, filtering the DAG list in the UI becomes difficult, especially when an environment holds many DAGs.
Use meaningful and consistent tags to group related DAGs.
Passing tags to a task operator instead of the DAG
Operators do not take a tags parameter, so Airflow rejects the task definition and the DAG fails to import.
Set tags in the DAG constructor, and pass doc_md to each task operator constructor.
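One way to keep tags consistent is to normalize them and check them against an agreed list before passing them to the DAG. The sketch below is an assumption about how a team might do this, not an Airflow feature: ALLOWED_TAGS, normalize_tag, and check_tags are hypothetical names, and the allowed list is made up for illustration.

```python
import re

# Hypothetical list of tags the team has agreed on.
ALLOWED_TAGS = {"example", "greeting", "data-processing", "notification"}


def normalize_tag(tag):
    """Lowercase the tag, trim it, and join words with hyphens."""
    return re.sub(r"[\s_]+", "-", tag.strip().lower())


def check_tags(tags, allowed=ALLOWED_TAGS):
    """Return the normalized, de-duplicated tags or raise on unknown ones."""
    normalized = sorted({normalize_tag(t) for t in tags})
    unknown = [t for t in normalized if t not in allowed]
    if unknown:
        raise ValueError(f"Unknown tags: {unknown}")
    return normalized


print(check_tags(["Example", "data processing"]))
# prints ['data-processing', 'example']
```

The returned list could be passed as tags=check_tags([...]) in the DAG constructor, so a misspelled tag fails when the DAG file is parsed instead of silently creating a new category.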
Summary
Define task documentation using the doc_md parameter to explain task purpose.
Use the tags parameter of the DAG to label DAGs for easy filtering in the Airflow UI.
Verify tasks and DAGs using airflow CLI commands like dags list and tasks test.