In Airflow, if you set catchup=True for a DAG, what behavior should you expect when the DAG is first deployed?
Think about what 'catching up' means in terms of scheduled runs.
When catchup=True, the scheduler creates a DAG run for every schedule interval that has completed between the start_date and the current date, so no scheduled run is skipped.
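As a rough illustration of what "catching up" means, the sketch below models which logical dates would receive a run for a fixed-interval schedule. It is a simplified model, not Airflow's scheduler code; the function name scheduled_logical_dates and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def scheduled_logical_dates(start_date, now, interval, catchup):
    """Simplified model of the logical dates that would get a scheduled run."""
    dates = []
    current = start_date
    # A run for a logical date becomes due only once its whole interval has elapsed.
    while current + interval <= now:
        dates.append(current)
        current += interval
    if not catchup:
        # Without catchup, only the most recent completed interval is scheduled.
        return dates[-1:]
    return dates


start = datetime(2024, 1, 1)
now = datetime(2024, 1, 4, 12, 0)
daily = timedelta(days=1)

for logical_date in scheduled_logical_dates(start, now, daily, catchup=True):
    print(logical_date.date().isoformat())
```

With catchup=True the model yields one entry per completed daily interval since start_date; with catchup=False it yields only the latest one.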
Given a DAG with catchup=False, what will be the output when running airflow dags backfill -s 2024-01-01 -e 2024-01-03 my_dag?
airflow dags backfill -s 2024-01-01 -e 2024-01-03 my_dag
Remember that backfill manually triggers runs regardless of catchup.
The backfill command manually triggers DAG runs for the specified date range, ignoring the catchup setting.
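To make the date range concrete, the following sketch lists the logical dates that the range 2024-01-01 to 2024-01-03 would cover. It assumes a daily schedule (the question does not state one) and that both endpoints are treated as inclusive; backfill_logical_dates is a hypothetical helper, not part of Airflow.

```python
from datetime import date, timedelta


def backfill_logical_dates(start, end, interval=timedelta(days=1)):
    """List logical dates from start to end, both inclusive (assumed daily schedule)."""
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += interval
    return dates


# The range from the command above: -s 2024-01-01 -e 2024-01-03
for logical_date in backfill_logical_dates(date(2024, 1, 1), date(2024, 1, 3)):
    print(logical_date.isoformat())
```

Note that the catchup flag does not appear anywhere in this calculation, which mirrors the point that backfill runs the requested range regardless of that setting.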
You have a DAG with catchup=True and a start_date set to 10 days ago. However, no past DAG runs are executing. What is the most likely cause?
Check if the DAG has a schedule defined.
If schedule_interval (named schedule in Airflow 2.4 and later) is None, the DAG runs only when triggered manually; the scheduler creates no scheduled runs, so catchup has no effect.
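The troubleshooting logic can be sketched as a small checklist over the DAG's settings. This is an illustrative helper only: why_no_catchup_runs and its arguments are hypothetical, and it takes plain values rather than a real Airflow DAG object.

```python
from datetime import datetime, timedelta


def why_no_catchup_runs(schedule, catchup, start_date, now):
    """Return the likely reasons no past runs are being created."""
    reasons = []
    if schedule is None:
        # No schedule means the DAG only runs when triggered manually.
        reasons.append("schedule is None: the DAG is manual-only")
    if not catchup:
        reasons.append("catchup is False: past intervals are not scheduled")
    if start_date is None or start_date >= now:
        reasons.append("start_date is missing or not in the past")
    return reasons


now = datetime(2024, 1, 11)
ten_days_ago = now - timedelta(days=10)

# The situation from the question: catchup enabled, old start_date, no schedule.
print(why_no_catchup_runs(None, True, ten_days_ago, now))
```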
Arrange the steps in the correct order to enable catchup on an existing DAG and ensure past runs are executed.
Think about what must be set before enabling catchup and restarting the scheduler.
First ensure the start_date is correct, then enable catchup, restart the scheduler to apply the changes, and finally monitor the runs.
You want to enable catchup=True on a DAG but avoid running all past missed DAG runs immediately. What is the best practice to achieve this?
Consider how Airflow decides which past runs to execute.
Setting the start_date to the current date (as a fixed date, not a dynamic value such as datetime.now()) means no completed schedule intervals exist to catch up on, so enabling catchup won't trigger old runs.
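A quick way to see the effect of start_date is to count how many completed intervals lie between it and now. The sketch below assumes a fixed daily interval; pending_catchup_runs is a hypothetical helper, not an Airflow API.

```python
from datetime import datetime, timedelta


def pending_catchup_runs(start_date, now, interval=timedelta(days=1)):
    """Count completed intervals between start_date and now (assumed fixed interval)."""
    if start_date >= now:
        return 0
    # Floor division of two timedeltas gives the number of whole intervals.
    return (now - start_date) // interval


now = datetime(2024, 1, 11)

# start_date ten days back leaves ten daily intervals to catch up on.
print(pending_catchup_runs(datetime(2024, 1, 1), now))
# start_date equal to the current date leaves nothing to catch up on.
print(pending_catchup_runs(now, now))
```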