0
0
Apache Airflowdevops~30 mins

Cron expressions in Airflow - Mini Project: Build & Apply

Choose your learning style9 modes available
Cron expressions in Airflow
📖 Scenario: You are setting up scheduled tasks in Apache Airflow to automate data workflows. You want to learn how to use cron expressions to control when your tasks run.
🎯 Goal: Build a simple Airflow DAG that uses a cron expression to schedule a task to run every day at 3:30 AM.
📋 What You'll Learn
Create a DAG with the exact id daily_report
Use a cron expression 30 3 * * * for the schedule_interval
Define a PythonOperator task with the id print_hello
The task should run a Python function that prints 'Hello Airflow'
Print the DAG's schedule_interval at the end
💡 Why This Matters
🌍 Real World
Scheduling workflows to run automatically at specific times is common in data engineering and DevOps. Airflow uses cron expressions to control these schedules.
💼 Career
Understanding cron expressions in Airflow is essential for roles like Data Engineer, DevOps Engineer, and Automation Specialist to automate and manage workflows reliably.
Progress0 / 4 steps
1
Create the Airflow DAG with the correct id
Import DAG from airflow and create a DAG object called dag with the id 'daily_report'. Use start_date as datetime(2024, 1, 1) and catchup=false.
Apache Airflow
Need a hint?

Use DAG(dag_id='daily_report', start_date=datetime(2024, 1, 1), catchup=false) to create the DAG.

2
Add the cron expression schedule_interval
Add the parameter schedule_interval='30 3 * * *' to the existing DAG creation line to schedule the DAG to run daily at 3:30 AM.
Apache Airflow
Need a hint?

Add schedule_interval='30 3 * * *' inside the DAG() parentheses.

3
Define a PythonOperator task that prints 'Hello Airflow'
Import PythonOperator from airflow.operators.python. Define a Python function called print_hello that prints 'Hello Airflow'. Then create a PythonOperator task with task_id='print_hello', python_callable=print_hello, and assign it to the dag.
Apache Airflow
Need a hint?

Define the function first, then create the PythonOperator with the correct parameters.

4
Print the DAG's schedule_interval
Write a print statement to display the schedule_interval of the dag object.
Apache Airflow
Need a hint?

Use print(dag.schedule_interval) to show the cron expression.