What is end_date in Airflow: Definition and Usage
end_date is a parameter that defines the last date and time a task or DAG should run. It acts like a stop sign, telling Airflow not to schedule any runs after this date.
How It Works
Think of end_date in Airflow as a calendar deadline for your tasks or workflows. When you set an end_date, Airflow will schedule runs only up to that date and no further. This helps control how long your workflows keep running.
Imagine you have a daily report that should only run until the end of the year. By setting end_date to December 31st, Airflow will automatically stop scheduling the report after that day, so you don’t have to manually turn it off.
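The deadline idea can be sketched in plain Python, without Airflow. This is a simplified model, not Airflow's scheduler: the helper name daily_run_dates, the max_runs cap, and the year 2024 are assumptions made for illustration. The sketch treats end_date as an inclusive boundary and ignores time zones and data intervals.

```python
from datetime import datetime, timedelta
from typing import Iterator, Optional


def daily_run_dates(
    start_date: datetime,
    end_date: Optional[datetime],
    max_runs: int = 1000,
) -> Iterator[datetime]:
    """Yield one date per day from start_date up to and including end_date.

    Plain-Python illustration only; it does not call Airflow.
    max_runs caps the output so an open-ended schedule (end_date=None)
    still terminates.
    """
    current = start_date
    produced = 0
    while produced < max_runs:
        if end_date is not None and current > end_date:
            break  # past the deadline: nothing more is scheduled
        yield current
        current += timedelta(days=1)
        produced += 1


# Daily report for the last three days of the (assumed) year 2024.
report_dates = list(daily_run_dates(datetime(2024, 12, 29), datetime(2024, 12, 31)))
print([d.date().isoformat() for d in report_dates])
# prints ['2024-12-29', '2024-12-30', '2024-12-31']
```

Passing end_date=None makes the generator stop only at the max_runs cap, which mirrors an open-ended schedule that keeps producing runs until someone turns it off.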
Example
This example shows a DAG with a start_date and an end_date. The DAG runs daily but stops after the end_date.
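    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy import DummyOperator

    # Daily DAG that is scheduled only between start_date and end_date.
    with DAG(
        dag_id='example_end_date_dag',
        start_date=datetime(2024, 1, 1),
        end_date=datetime(2024, 1, 5),  # no runs are scheduled after this date
        schedule_interval='@daily',
        catchup=True,  # backfill any runs missed since start_date
    ) as dag:
        task = DummyOperator(task_id='dummy_task')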
When to Use
Use end_date when you want to limit how long a DAG or task runs. This is useful for temporary workflows, seasonal jobs, or data pipelines that only need to run for a fixed period.
For example, if you have a data cleanup job that should only run during a migration window, setting an end_date ensures it stops automatically after the migration ends.
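One way to keep a temporary job tied to its window is to compute end_date from the window's start and length instead of hard-coding it. The sketch below is plain Python; the helper window_dag_kwargs and the 14-day window starting 1 March 2024 are hypothetical and not part of Airflow.

```python
from datetime import datetime, timedelta


def window_dag_kwargs(window_start: datetime, window_days: int) -> dict:
    """Build start_date/end_date keyword arguments for a fixed-length window.

    Hypothetical helper for illustration; not part of Airflow.
    """
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    # The last day of the window is window_days - 1 days after the first.
    window_end = window_start + timedelta(days=window_days - 1)
    return {"start_date": window_start, "end_date": window_end}


# Assumed migration window: 14 days starting 1 March 2024.
cleanup_kwargs = window_dag_kwargs(datetime(2024, 3, 1), 14)
print(cleanup_kwargs["end_date"])
# prints 2024-03-14 00:00:00
```

The resulting dictionary could then be unpacked into the DAG constructor alongside dag_id and schedule_interval, for example DAG(dag_id='cleanup', schedule_interval='@daily', **cleanup_kwargs), where the dag_id is again only a placeholder.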
Key Points
- end_date stops scheduling after the specified date and time.
- It works together with start_date and schedule_interval to control DAG runs.
- If end_date is not set, the DAG or task can run indefinitely.
- Setting end_date helps avoid unwanted runs and resource use.
Key Takeaways
- end_date defines the last date a DAG or task will run in Airflow.
- Without end_date, workflows can run indefinitely if scheduled.
- Use it with start_date and schedule_interval to manage run timing.