How to Schedule a DAG in Apache Airflow: Syntax and Examples
In Apache Airflow, you schedule a DAG by setting the schedule_interval parameter in the DAG definition to a cron expression or a preset string such as @daily. This tells Airflow when to run the DAG automatically.
Syntax
The schedule_interval parameter in the DAG constructor defines when the DAG runs. It accepts:
- Cron expressions like "0 12 * * *" to run at noon daily.
- Preset strings like @hourly, @daily, @weekly.
- None to disable automatic scheduling.
Example: schedule_interval='0 6 * * *' runs the DAG every day at 6 AM (UTC by default).
Note: in Airflow 2.4 and later, the same values are passed through the schedule parameter, and schedule_interval is deprecated.
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator

dag = DAG(
    dag_id='example_dag',
    start_date=datetime(2024, 1, 1),
    schedule_interval='0 6 * * *',  # Runs daily at 6 AM
)

task = DummyOperator(task_id='dummy_task', dag=dag)
```
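To make the cron fields concrete, here is a small standard-library sketch that labels each part of a standard five-field expression. The helper name describe_cron is hypothetical and not part of Airflow; it only splits the string and checks the field count, without validating the values in each field.

```python
# Hypothetical helper for illustration; not part of Airflow.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Label the five whitespace-separated fields of a cron expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


fields = describe_cron("0 6 * * *")
print(fields)
```

Reading the result left to right, '0 6 * * *' means minute 0 of hour 6, on every day of the month, in every month, on every day of the week.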
Example
This example shows a DAG scheduled to run every day at midnight using the preset @daily. It contains a simple dummy task.
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator

dag = DAG(
    dag_id='daily_dag',
    start_date=datetime(2024, 1, 1),
    schedule_interval='@daily',  # Runs once every day at midnight
)

task = DummyOperator(task_id='start', dag=dag)
```
Output
No direct output; the Airflow scheduler triggers the DAG daily at midnight.
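One detail worth checking against your Airflow version: in Airflow 2.x, a run for an interval-based schedule starts once the interval it covers has ended. With start_date set to 2024-01-01 and @daily, the first run is therefore triggered shortly after midnight on 2024-01-02, not on 2024-01-01. The arithmetic can be sketched with the standard library alone; first_trigger_time is a hypothetical name, not an Airflow function.

```python
from datetime import datetime, timedelta


def first_trigger_time(start_date: datetime, interval: timedelta) -> datetime:
    """Return when the first run would start: at the end of the first interval."""
    return start_date + interval


start = datetime(2024, 1, 1)
trigger = first_trigger_time(start, timedelta(days=1))
print(f"interval starts {start:%Y-%m-%d %H:%M}, run triggered at {trigger:%Y-%m-%d %H:%M}")
```

This is why a freshly deployed daily DAG can look idle for up to a day even though its start_date is already in the past.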
Common Pitfalls
- Wrong start_date: Setting start_date in the future delays DAG runs until that date.
- Using schedule_interval=None: This disables scheduling; the DAG runs only when triggered manually.
- Misunderstanding cron syntax: Incorrect cron expressions cause unexpected schedules.
- Timezone issues: Airflow uses UTC by default; local time differences can confuse scheduling.
```python
from datetime import datetime

from airflow import DAG

# Wrong: start_date in the future delays runs
wrong_dag = DAG(
    dag_id='wrong_start_date',
    start_date=datetime(2099, 1, 1),  # Far future date
    schedule_interval='@daily',
)

# Correct: start_date in the past or present
correct_dag = DAG(
    dag_id='correct_start_date',
    start_date=datetime(2024, 1, 1),
    schedule_interval='@daily',
)

# Manual only: schedule_interval=None disables automatic runs
manual_dag = DAG(
    dag_id='manual_trigger_only',
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,  # Runs only when triggered manually
)

# For automatic runs, use a valid cron expression or preset instead
```
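The timezone pitfall is easier to see with numbers. The sketch below uses only the standard library and converts a 06:00 UTC run time to a local clock time. The fixed UTC-5 offset is an assumption chosen for illustration; real zones with daylight saving time shift this offset during the year.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for a local zone.
LOCAL = timezone(timedelta(hours=-5), name="UTC-05:00")

# '0 6 * * *' fires at 06:00 UTC when the DAG has no explicit timezone.
run_utc = datetime(2024, 1, 15, 6, 0, tzinfo=timezone.utc)
run_local = run_utc.astimezone(LOCAL)

print(run_utc.isoformat())
print(run_local.isoformat())
```

If the schedule should follow local time instead, the Airflow documentation describes giving the DAG a timezone-aware start_date (for example, one built with the pendulum library) so that cron expressions are interpreted in that zone.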
Quick Reference
Common schedule_interval presets and their meanings:
| Preset | Meaning |
|---|---|
| @once | Run once immediately after start_date |
| @hourly | Run every hour |
| @daily | Run once a day at midnight UTC |
| @weekly | Run once a week on Sunday at midnight UTC |
| @monthly | Run once a month on the first day at midnight UTC |
| cron expression | Custom schedule using standard cron syntax |
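As a cross-check on the table above, each time-based preset corresponds to a plain cron expression. The mapping below is a sketch that, to the best of my knowledge, mirrors the one Airflow uses internally; the names PRESET_TO_CRON and to_cron are hypothetical and exist only for this example.

```python
# Cron equivalents for the time-based presets in the table.
# '@once' has no cron equivalent, so it is left out.
PRESET_TO_CRON = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
}


def to_cron(schedule: str) -> str:
    """Expand a known preset to its cron form; return other strings unchanged."""
    return PRESET_TO_CRON.get(schedule, schedule)


print(to_cron("@weekly"))
print(to_cron("0 6 * * *"))
```

Writing the cron form out makes it clear why @weekly lands on Sunday (day-of-week 0) and @monthly on the first day of the month.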
Key Takeaways
Set the schedule_interval parameter in your DAG to control when it runs automatically.
Use cron expressions or preset strings like @daily for easy scheduling.
Ensure start_date is in the past or present to avoid delayed runs.
Setting schedule_interval=None disables automatic scheduling.
Remember that Airflow uses the UTC timezone by default for scheduling.