0
0
AirflowDebug / FixBeginner · 4 min read

How to Debug DAG in Airflow: Step-by-Step Guide

To debug a DAG in Airflow, first check the task logs in the Airflow UI for error details. Use airflow tasks test command to run tasks locally and identify issues without triggering the whole DAG.
🔍

Why This Happens

Errors in Airflow DAGs often happen because of syntax mistakes, missing dependencies, or incorrect task configurations. These cause tasks to fail or the DAG to not load properly.

python
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

with DAG('example_dag', start_date=datetime(2024, 1, 1)) as dag:
    task1 = BashOperator(
        task_id='print_date',
        bash_command='datee'
    )
Output
bash: datee: command not found [2024-xx-xx xx:xx:xx] Task failed with exit code 127
🔧

The Fix

Fix the typo in the bash_command to the correct command date. This ensures the task runs successfully without errors.

python
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

with DAG('example_dag', start_date=datetime(2024, 1, 1)) as dag:
    task1 = BashOperator(
        task_id='print_date',
        bash_command='date'
    )
Output
[2024-xx-xx xx:xx:xx] Running command: date Thu Apr 25 12:00:00 UTC 2024 [2024-xx-xx xx:xx:xx] Task succeeded
🛡️

Prevention

To avoid DAG errors, always test tasks locally using airflow tasks test <dag_id> <task_id> <execution_date>. Use the Airflow UI logs to catch errors early. Follow best practices like clear task naming, proper dependencies, and version control for DAG files.

⚠️

Related Errors

  • Import Errors: Caused by missing Python packages or wrong imports; fix by installing dependencies and checking import paths.
  • Schedule Interval Issues: Incorrect cron expressions can prevent DAG runs; validate cron syntax.
  • Connection Failures: Tasks needing external services fail if connections are misconfigured; verify Airflow connections.

Key Takeaways

Check task logs in Airflow UI to find error details quickly.
Use 'airflow tasks test' to run tasks locally and debug without triggering the full DAG.
Fix syntax and command typos to prevent task failures.
Follow best practices like clear naming and dependency management to avoid errors.
Validate external connections and cron schedules to ensure smooth DAG runs.