Default args and DAG parameters
📖 Scenario: You are setting up a simple Airflow DAG to automate a daily data backup task. To keep your code clean and avoid repetition, you want to use default_args to set common parameters for your DAG and tasks.
🎯 Goal: Create an Airflow DAG named daily_backup that runs daily at 2 AM. Use default_args to set the owner to data_engineer and the start date to January 1, 2024. Then, define a simple task that prints 'Backing up data...'.
📋 What You'll Learn
Create a dictionary called default_args with keys 'owner' and 'start_date' with exact values 'data_engineer' and datetime(2024, 1, 1) respectively.
Create a DAG named daily_backup with default_args=default_args and schedule_interval='0 2 * * *'.
Define a PythonOperator task named backup_task that runs a Python function printing 'Backing up data...'.
Print the DAG's dag_id to confirm setup.
💡 Why This Matters
🌍 Real World
Using default_args helps keep Airflow DAGs clean and consistent, especially when many tasks share the same settings like owner, retries, or start date.
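The sharing described here follows a simple precedence rule: a task inherits every entry in default_args unless it sets that argument itself. The sketch below models that rule with plain dictionaries. It is a simplified illustration, not Airflow's internal code; the helper resolve_task_args and the retries values are assumptions made for the example.

```python
from datetime import datetime

# Shared defaults from the exercise, plus a hypothetical 'retries' entry.
default_args = {
    "owner": "data_engineer",
    "start_date": datetime(2024, 1, 1),
    "retries": 1,
}


def resolve_task_args(defaults, **task_kwargs):
    """Return the effective settings for one task.

    Simplified model: start from the shared defaults, then let any
    argument passed directly to the task overwrite the default.
    """
    return {**defaults, **task_kwargs}


# A task that passes nothing extra inherits every default.
backup_args = resolve_task_args(default_args)

# A task that sets 'retries' itself still keeps the other defaults.
cleanup_args = resolve_task_args(default_args, retries=3)

print(backup_args["owner"], backup_args["retries"])
print(cleanup_args["owner"], cleanup_args["retries"])
```

Changing a value in default_args updates every task that relies on it, while tasks with their own explicit value are unaffected.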
💼 Career
Understanding DAG parameters and default arguments is essential for data engineers and DevOps professionals managing automated workflows with Airflow.