Cron expressions in Airflow - Commands & Configuration

Introduction
Scheduling tasks to run automatically at specific times can be tricky. Cron expressions help by letting you set exact schedules in Airflow, so your workflows run when you want without manual effort.
When you want a task to run every day at 3 AM to process daily reports.
When you need a job to run every Monday at 9 AM to start weekly data backups.
When you want to run a task every 15 minutes to check for new data arrivals.
When you want to run a task at midnight on the first day of every month for monthly billing.
When you want to schedule a task to run every hour during business hours only.
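As a rough illustration, the five use cases above could map to the cron expressions below. The dictionary name and the descriptions are hypothetical, and the last entry assumes that "business hours" means 09:00 to 17:00, Monday to Friday.

```python
# Cron expressions for the five use cases above.
# Field order: minute hour day-of-month month day-of-week
USE_CASE_SCHEDULES = {
    "daily reports at 3 AM": "0 3 * * *",
    "weekly backups, Mondays at 9 AM": "0 9 * * 1",
    "check for new data every 15 minutes": "*/15 * * * *",
    "monthly billing, midnight on the 1st": "0 0 1 * *",
    # Assumption: business hours are 09:00-17:00, Monday to Friday.
    "hourly during business hours": "0 9-17 * * 1-5",
}

for description, cron in USE_CASE_SCHEDULES.items():
    print(f"{cron:<15} {description}")
```

Any of these strings can be passed as the schedule of a DAG in place of the expression used in the example below.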
Config File - my_dag.py
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

with DAG(
    dag_id='cron_expression_example',
    start_date=datetime(2024, 1, 1),
    schedule_interval='0 3 * * *',  # Runs daily at 3 AM
    catchup=False
) as dag:
    task = BashOperator(
        task_id='print_date',
        bash_command='date'
    )

This DAG file defines a simple Airflow workflow.

dag_id names the workflow.

start_date sets when the schedule starts.

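schedule_interval takes the cron expression '0 3 * * *', which runs the DAG daily at 3 AM (UTC unless a time zone is configured). From Airflow 2.4 onward the parameter is named schedule; schedule_interval is deprecated there and was removed in Airflow 3.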

catchup=False means it won't run missed schedules from the past.

The BashOperator runs the date command as a task.
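To get a feel for what catchup=False avoids, the sketch below counts how many 3 AM ticks fall between a start_date and the current time. With catchup enabled, Airflow would create roughly one run per past schedule. count_daily_ticks is a hypothetical helper, not part of Airflow, and it ignores time zones and the details of Airflow's data intervals.

```python
from datetime import datetime, timedelta


def count_daily_ticks(start: datetime, now: datetime, hour: int = 3) -> int:
    """Count daily ticks at `hour`:00 that fall in the window [start, now)."""
    # First tick on or after `start`.
    tick = start.replace(hour=hour, minute=0, second=0, microsecond=0)
    if tick < start:
        tick += timedelta(days=1)

    count = 0
    while tick < now:
        count += 1
        tick += timedelta(days=1)
    return count


# start_date from the DAG above, "now" taken as 2024-06-01 12:00 for illustration.
print(count_daily_ticks(datetime(2024, 1, 1), datetime(2024, 6, 1, 12, 0)))
```

The further start_date lies in the past, the larger this number gets, which is why the example DAG sets catchup=False.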

Commands
Lists all DAGs currently available in Airflow to verify your DAG is recognized.
Terminal
airflow dags list
Expected Output
dag_id                  | filepath
cron_expression_example | /usr/local/airflow/dags/my_dag.py
Manually triggers the DAG to run immediately, useful for testing your tasks without waiting for the next scheduled time.
Terminal
airflow dags trigger cron_expression_example
Expected Output
Created <DagRun cron_expression_example @ 2024-06-01T12:00:00+00:00: manual__2024-06-01T12:00:00+00:00, externally triggered: True>
Lists all tasks inside the DAG to confirm the task names and structure.
Terminal
airflow tasks list cron_expression_example
Expected Output
print_date
Runs the 'print_date' task for the given date directly, without the scheduler, so you can check the task's behavior in isolation.
Terminal
airflow tasks test cron_expression_example print_date 2024-06-01
Expected Output
[2024-06-01 12:00:00,000] {bash.py:123} INFO - Running command: date
[2024-06-01 12:00:00,100] {bash.py:130} INFO - Output: Sat Jun 1 12:00:00 UTC 2024
[2024-06-01 12:00:00,200] {taskinstance.py:123} INFO - Task succeeded
Key Concept

If you remember nothing else from this pattern, remember: a cron expression sets exactly when an Airflow DAG runs through five fields, in this order: minute, hour, day of month, month, and day of week.
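A minimal sketch of how the five fields line up, assuming a plain five-field expression. label_cron_fields and CRON_FIELDS are hypothetical names, not Airflow APIs, and the range check only covers plain numbers; wildcards, ranges, steps, and lists are passed through unchecked.

```python
# (field name, minimum, maximum) in the order cron expects them.
CRON_FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 7),  # 0 and 7 both mean Sunday in many cron implementations
]


def label_cron_fields(expression: str) -> dict:
    """Attach a field name to each part of a five-field cron expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")

    labeled = {}
    for (name, low, high), value in zip(CRON_FIELDS, parts):
        # Only plain numbers are range-checked; '*', '1-5', '*/15' pass through.
        if value.isdigit() and not low <= int(value) <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high}")
        labeled[name] = value
    return labeled


print(label_cron_fields("0 3 * * *"))
```

Printing the labeled fields before deploying a DAG is a quick way to confirm that the number you meant as the hour has not landed in the minute position.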

Common Mistakes
Using a cron expression with wrong field order or missing fields.
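Airflow expects 5 fields in the order: minute, hour, day of month, month, day of week. A missing field makes the expression invalid, while swapped fields usually still parse but fire at the wrong time: '3 0 * * *' runs at 00:03, not at 3 AM.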
Always write cron expressions as 'minute hour day month weekday', for example '0 3 * * *' for daily at 3 AM.
Setting start_date in the future or forgetting catchup=False.
If start_date is in the future, the DAG won't run until after that date. If catchup=True (the default in Airflow 2), Airflow creates a run for every schedule missed since start_date, which can produce a large backlog of runs.
Set start_date to a past date and use catchup=False to avoid unexpected backlogs.
Using '@daily' or '@hourly' without understanding they are shortcuts.
Shortcuts are fixed aliases: '@daily' is the same as '0 0 * * *' (midnight) and '@hourly' the same as '0 * * * *' (top of every hour), so they cannot express a custom time such as 3 AM.
Use full cron expressions for precise control, especially for non-standard schedules.
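For reference, the common Airflow presets correspond to the cron expressions below. The lookup helper expand_preset is a hypothetical convenience for illustration, not an Airflow function.

```python
# Common Airflow schedule presets and the cron expressions they stand for.
CRON_PRESETS = {
    "@hourly": "0 * * * *",   # top of every hour
    "@daily": "0 0 * * *",    # midnight every day
    "@weekly": "0 0 * * 0",   # midnight on Sunday
    "@monthly": "0 0 1 * *",  # midnight on the 1st of the month
    "@yearly": "0 0 1 1 *",   # midnight on January 1st
}


def expand_preset(schedule: str) -> str:
    """Return the cron expression behind a preset, or the input unchanged."""
    return CRON_PRESETS.get(schedule, schedule)


print(expand_preset("@daily"))
print(expand_preset("0 3 * * *"))
```

Seeing '@daily' expanded to midnight makes it clear why a 3 AM run needs the full expression '0 3 * * *'.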
Summary
Define your Airflow DAG with a cron expression in the schedule_interval parameter (named schedule in Airflow 2.4 and later) to control when tasks run.
Use airflow CLI commands to list DAGs, trigger runs, list tasks, and test tasks for validation.
Remember the cron format is minute, hour, day of month, month, and day of week for correct scheduling.