Apache Airflow · DevOps · ~30 mins

Default args and DAG parameters in Apache Airflow - Mini Project: Build & Apply

📖 Scenario: You are setting up a simple Airflow DAG to automate a daily data backup task. To keep your code clean and avoid repetition, you want to use default_args to set common parameters for your DAG and tasks.
🎯 Goal: Create an Airflow DAG named daily_backup that runs daily at 2 AM. Use default_args to set the owner to data_engineer and the start date to January 1, 2024. Then, define a simple task that prints 'Backing up data...'.
📋 What You'll Learn
Create a dictionary called default_args containing 'owner': 'data_engineer' and 'start_date': datetime(2024, 1, 1).
Create a DAG named daily_backup with default_args=default_args and schedule_interval='0 2 * * *'.
Define a PythonOperator task named backup_task that runs a Python function printing 'Backing up data...'.
Print the DAG's dag_id to confirm setup.
💡 Why This Matters
🌍 Real World
Using <code>default_args</code> helps keep Airflow DAGs clean and consistent, especially when many tasks share the same settings like owner, retries, or start date.
💼 Career
Understanding DAG parameters and default arguments is essential for data engineers and DevOps professionals managing automated workflows with Airflow.
1
Create default_args dictionary
Create a dictionary called default_args with these exact entries: 'owner': 'data_engineer' and 'start_date': datetime(2024, 1, 1). Import datetime from the datetime module.
💡 Hint: Use default_args = {'owner': 'data_engineer', 'start_date': datetime(2024, 1, 1)}.
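This first step can be sketched as a runnable snippet; it is pure Python, so no Airflow installation is needed yet:

```python
from datetime import datetime

# Shared settings that every task in the DAG will inherit.
# More keys (e.g. 'retries' or 'retry_delay') could be added the same way.
default_args = {
    'owner': 'data_engineer',
    'start_date': datetime(2024, 1, 1),
}

print(default_args['owner'])  # prints: data_engineer
```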

2
Create the DAG with default_args
Import DAG from airflow. Create a DAG named daily_backup using default_args=default_args and schedule_interval='0 2 * * *'.
💡 Hint: Use daily_backup = DAG('daily_backup', default_args=default_args, schedule_interval='0 2 * * *').

3
Define the backup task
Import PythonOperator from airflow.operators.python. Define a Python function called backup_function that prints 'Backing up data...'. Then create a PythonOperator task named backup_task using task_id='backup_task', python_callable=backup_function, and dag=daily_backup.
💡 Hint: Define a function that prints the message, then create a PythonOperator with the given parameters.
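The callable itself is plain Python, so it can be tried on its own before wiring it into Airflow:

```python
# Airflow only needs a reference to this function; it calls it when the
# task runs. Running it directly is a quick sanity check.
def backup_function():
    print('Backing up data...')

backup_function()  # prints: Backing up data...
```

Inside the DAG file you would then pass it as python_callable to the PythonOperator, as described above.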

4
Print the DAG ID
Write a print statement to display the dag_id of the DAG daily_backup.
💡 Hint: Use print(daily_backup.dag_id) to show the DAG ID.