What is the primary role of the database backend in Apache Airflow?
Think about where Airflow keeps track of what tasks ran and when.
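The database backend is Airflow's metadata store: it records DAG runs, task instances, and scheduling state, along with configuration objects such as connections and variables. It does not execute tasks (the executor and workers do that) or serve the UI (the webserver does).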
What is the output of the command airflow db check when the database connection is healthy?
airflow db check
Look for a positive confirmation message.
When the database is reachable, the airflow db check command logs a short confirmation (in recent releases the message reads 'Connection successful.') and exits with status 0.
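Because the exact wording of the message can differ between releases, a script is usually better off testing the exit status than matching the text. The following is a minimal sketch under that assumption; the helper name db_is_healthy and the 60-second timeout are illustrative and not part of Airflow.

```python
import subprocess
from typing import Sequence


def db_is_healthy(
    cmd: Sequence[str] = ("airflow", "db", "check"),
    timeout: float = 60.0,
) -> bool:
    """Return True if the check command exits with status 0.

    Hypothetical helper: it assumes the command signals failure with a
    non-zero exit status rather than through its log output alone.
    """
    try:
        result = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # The CLI is not installed, or the database did not answer in time.
        return False
    return result.returncode == 0
```

Calling db_is_healthy() with no arguments runs airflow db check on the current machine; the cmd parameter exists so the helper can be exercised without an Airflow installation.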
Which configuration change in airflow.cfg helps reduce database load by limiting the number of task instances fetched per query?
Focus on scheduler settings that control database queries.
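The max_tis_per_query option in the [scheduler] section of airflow.cfg caps how many task instances the scheduler fetches in a single database query. A lower value keeps each query smaller, which reduces the load the scheduler places on the database.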
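To make the location of the setting concrete, the sketch below parses an illustrative airflow.cfg fragment with Python's configparser and prints the matching environment-variable override. The value 100 is an arbitrary example, not a recommendation or a default.

```python
import configparser

# Illustrative airflow.cfg fragment; 100 is an arbitrary example value.
SAMPLE_CFG = """
[scheduler]
max_tis_per_query = 100
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CFG)
batch_size = parser.getint("scheduler", "max_tis_per_query")

# Airflow also accepts settings as AIRFLOW__{SECTION}__{KEY} environment
# variables, so the same option can be set without editing the file.
env_var = "AIRFLOW__SCHEDULER__MAX_TIS_PER_QUERY"
print(f"{env_var}={batch_size}")
```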
The Airflow scheduler is running slowly. Which database-related issue is most likely causing this?
Think about what happens if the database grows too large.
If the task_instance table grows too large without cleanup, queries slow down, causing scheduler delays.
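One way to judge whether table growth is the problem is to count how many rows fall outside a chosen retention window. The sketch below shows the idea against an in-memory SQLite table that stands in for task_instance; the three-column schema is a deliberate simplification, and the 90-day window and the helper name count_old_rows are assumptions for illustration only.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def count_old_rows(conn: sqlite3.Connection, cutoff: datetime) -> int:
    """Count task instances that started before the cutoff."""
    row = conn.execute(
        "SELECT COUNT(*) FROM task_instance WHERE start_date < ?",
        (cutoff.isoformat(),),
    ).fetchone()
    return row[0]


# Simplified stand-in for the real table, which has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance (task_id TEXT, dag_id TEXT, start_date TEXT)"
)

# Four sample rows aged 1, 10, 100, and 400 days relative to a fixed "now".
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [
    ("example_task", "example_dag", (now - timedelta(days=age)).isoformat())
    for age in (1, 10, 100, 400)
]
conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?)", rows)

cutoff = now - timedelta(days=90)
print(count_old_rows(conn, cutoff))
```

A large count relative to the total row count suggests that most of the table is history the scheduler no longer needs for day-to-day decisions.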
Which practice is best to keep the Airflow metadata database optimized and prevent performance degradation?
Think about how to keep the database size manageable.
Regularly purging old metadata, either with the airflow db clean command (available since Airflow 2.3) or with a scheduled cleanup DAG, prevents database bloat and keeps query performance steady.
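A scheduled job could assemble the cleanup command as in the sketch below. The helper build_clean_command and its retention_days parameter are hypothetical; the flags it emits (--clean-before-timestamp, --tables, --dry-run, --yes) belong to airflow db clean. The sketch defaults to a dry run so that nothing is deleted until the output has been reviewed.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Sequence


def build_clean_command(
    retention_days: int,
    now: Optional[datetime] = None,
    tables: Optional[Sequence[str]] = None,
    dry_run: bool = True,
) -> List[str]:
    """Build an `airflow db clean` argument list (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    cmd = [
        "airflow", "db", "clean",
        "--clean-before-timestamp", cutoff.isoformat(),
    ]
    if tables:
        # Restrict the purge to specific tables instead of all supported ones.
        cmd += ["--tables", ",".join(tables)]
    # --dry-run only reports what would be removed; --yes skips the
    # interactive confirmation when deleting for real.
    cmd.append("--dry-run" if dry_run else "--yes")
    return cmd


print(" ".join(build_clean_command(30, tables=["task_instance", "log"])))
```

The returned list can be passed to subprocess.run once the dry-run report looks right. Taking a database backup before the first real purge is a sensible precaution.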