
What is max_active_runs in Airflow: Explanation and Usage

In Apache Airflow, max_active_runs is a DAG-level setting that limits how many runs of a single DAG can be active at the same time. It controls concurrency by preventing too many runs from executing simultaneously, which protects resources and avoids overload.

How It Works

Imagine a kitchen where you can cook several dishes at once but have only a limited number of burners. max_active_runs is like the number of burners you allow to be used simultaneously for one recipe (DAG).

When you set max_active_runs for a DAG, Airflow keeps at most that many runs of the DAG active at the same time. If the limit is reached, new runs wait in line until an active one finishes. The limit applies per DAG and counts runs rather than individual tasks, so it keeps one workflow from overwhelming your system with too many parallel runs.
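
The waiting behavior described above can be illustrated with a small toy model in plain Python. This is only a sketch of the idea, not Airflow's scheduler code: the admit_runs function and the run IDs are hypothetical names made up for the illustration.

```python
from collections import deque


def admit_runs(queued, running, max_active_runs):
    """Move queued run IDs into `running` until the limit is reached.

    Toy model only: `queued` is a deque of run IDs in arrival order and
    `running` is a list of run IDs that are currently active.
    """
    while queued and len(running) < max_active_runs:
        running.append(queued.popleft())
    return running


# Four runs arrive at once, but only two may be active.
queued = deque(["run_1", "run_2", "run_3", "run_4"])
running = admit_runs(queued, [], max_active_runs=2)
print(running)       # ['run_1', 'run_2']
print(list(queued))  # ['run_3', 'run_4']

# run_1 finishes, which frees a slot for the next queued run.
running.remove("run_1")
running = admit_runs(queued, running, max_active_runs=2)
print(running)       # ['run_2', 'run_3']
print(list(queued))  # ['run_4']
```

A run is admitted only when a slot is free, so the number of active runs never exceeds the limit, and waiting runs start in arrival order in this model.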

This setting is useful to balance workload and resource use, ensuring your Airflow environment stays stable and efficient.

Example

This example shows how to set max_active_runs in a DAG definition to limit concurrent runs to 2.

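```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id='example_max_active_runs',
    start_date=datetime(2024, 1, 1),
    # Airflow 2.4+ prefers the name `schedule` for this argument.
    schedule_interval='@daily',
    # Allow at most 2 runs of this DAG to be active at once.
    max_active_runs=2,
    # Skip runs for past intervals instead of backfilling them.
    catchup=False,
) as dag:
    # A single task that prints the current date.
    task1 = BashOperator(
        task_id='print_date',
        bash_command='date',
    )
```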
Output
This DAG file produces no direct output. Once it is loaded, Airflow keeps at most 2 runs of the DAG active at the same time, whether they are scheduled or triggered manually.
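
If a DAG does not set max_active_runs, Airflow falls back to the max_active_runs_per_dag option in the [core] section of its configuration, which defaults to 16 and can be overridden with the AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG environment variable. The sketch below mimics that lookup order with the standard library only. The effective_max_active_runs helper is hypothetical and is not part of Airflow.

```python
import configparser

# Airflow's documented default for [core] max_active_runs_per_dag.
AIRFLOW_DEFAULT = 16
ENV_KEY = "AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG"


def effective_max_active_runs(dag_value=None, env=None, cfg_text=""):
    """Hypothetical helper: DAG argument first, then the environment
    variable, then the config file text, then the default."""
    if dag_value is not None:
        return dag_value
    env = env or {}
    if ENV_KEY in env:
        return int(env[ENV_KEY])
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.getint(
        "core", "max_active_runs_per_dag", fallback=AIRFLOW_DEFAULT
    )


cfg = "[core]\nmax_active_runs_per_dag = 4\n"
print(effective_max_active_runs(dag_value=2, cfg_text=cfg))  # 2
print(effective_max_active_runs(cfg_text=cfg))               # 4
print(effective_max_active_runs(env={ENV_KEY: "8"}, cfg_text=cfg))  # 8
print(effective_max_active_runs())                           # 16
```

In practice this means a value set in the DAG definition always wins for that DAG, while the configuration option only changes the fallback for DAGs that leave it unset.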

When to Use

Use max_active_runs when you want to control how many instances of a DAG run simultaneously to avoid resource exhaustion or conflicts.

For example, if your DAG triggers heavy database queries or external API calls, limiting concurrent runs prevents overloading those systems.

It is also helpful when runs must not overlap (set max_active_runs to 1 so that only one run is active at a time) or when downstream systems have limited capacity.
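
One rough way to pick a value is to work backward from the capacity of the system you want to protect. The sketch below assumes you know how many database connections can be spared and how many a single run uses at peak. Both numbers and the max_runs_for_budget helper are hypothetical planning aids, not Airflow settings.

```python
def max_runs_for_budget(connection_budget, connections_per_run):
    """Largest max_active_runs that keeps peak connections within budget.

    Both arguments are hypothetical planning numbers, not Airflow settings.
    """
    if connections_per_run <= 0:
        raise ValueError("connections_per_run must be positive")
    # Never go below 1. If even a single run exceeds the budget,
    # the budget itself needs revisiting.
    return max(1, connection_budget // connections_per_run)


# A database that can spare 20 connections, with each run using up to 8.
limit = max_runs_for_budget(connection_budget=20, connections_per_run=8)
print(limit)      # 2
print(limit * 8)  # 16 connections at peak, within the budget of 20
```

The result would then be passed as max_active_runs in the DAG definition. Treat it as a starting point and adjust after observing real load.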

Key Points

  • max_active_runs limits concurrent DAG runs.
  • It helps manage resource usage and system stability.
  • Set it in the DAG definition as an integer.
  • New runs wait if the limit is reached.
  • Useful for heavy or resource-sensitive workflows.

Key Takeaways

  • max_active_runs controls how many DAG runs can execute at once.
  • Setting it prevents overloading your system with too many parallel runs.
  • It is configured in the DAG file as an integer parameter.
  • Use it to protect external systems or manage resource-heavy workflows.
  • Runs beyond the limit wait until active runs finish.