Recall & Review
beginner
What is high availability (HA) in Airflow?
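High availability in Airflow means deploying the system so that it keeps scheduling and running workflows even when an individual component fails. The goal is to remove single points of failure, so that losing one scheduler, worker, or database node does not stop your workflows.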
intermediate
Name two key components to configure for Airflow high availability.
1. Multiple Airflow schedulers running in parallel.
2. A highly available metadata database (like PostgreSQL with replication).
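As a rough illustration, every scheduler in such a setup would be started with the same settings, all pointing at one shared database endpoint. The sketch below builds those settings as environment variables. The hostnames, credentials, and the `build_ha_env` helper are hypothetical, and the `AIRFLOW__DATABASE__` section name assumes a recent Airflow 2.x release (older releases keep the connection string under `AIRFLOW__CORE__`).

```python
def build_ha_env(db_host: str, broker_host: str) -> dict:
    """Return environment variables shared by every scheduler and worker.

    db_host is assumed to be a failover-aware endpoint in front of a
    replicated PostgreSQL cluster; broker_host is assumed to be a Redis
    instance. Both names and the credentials are placeholders.
    """
    db_url = f"airflow:airflow@{db_host}:5432/airflow"
    return {
        # CeleryExecutor lets tasks run on a pool of separate workers.
        "AIRFLOW__CORE__EXECUTOR": "CeleryExecutor",
        # All schedulers must share the same metadata database.
        "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN": f"postgresql+psycopg2://{db_url}",
        # Broker that queues tasks for the Celery workers.
        "AIRFLOW__CELERY__BROKER_URL": f"redis://{broker_host}:6379/0",
        # Where Celery workers record task results.
        "AIRFLOW__CELERY__RESULT_BACKEND": f"db+postgresql://{db_url}",
    }


env = build_ha_env("pg-ha.example.internal", "redis.example.internal")
for key, value in env.items():
    print(f"{key}={value}")
```

Starting the scheduler process on two or more machines with this same environment is what gives you the parallel schedulers from point 1.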
intermediate
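Why use a message broker like Redis or RabbitMQ in an Airflow HA setup?
With the CeleryExecutor, a message broker queues the tasks that the scheduler sends out and delivers each one to an available worker. This spreads tasks across workers and lets the remaining workers keep pulling tasks if one of them fails. Coordination between multiple schedulers happens through the metadata database, not the broker.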
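A toy model can make the queueing idea concrete. The sketch below uses Python's standard `queue.Queue` as a stand-in for the broker and threads as stand-ins for workers. It is not how Celery or Redis are implemented, and the `run_workers` helper and worker names are made up for illustration.

```python
import queue
import threading


def run_workers(task_ids, num_workers=3):
    """Distribute task_ids to workers through a shared queue (the 'broker')."""
    broker = queue.Queue()
    for task_id in task_ids:
        broker.put(task_id)

    done = []                      # (task_id, worker_name) pairs
    done_lock = threading.Lock()   # protects the shared results list

    def worker(name):
        while True:
            try:
                # Each task is handed to exactly one worker.
                task_id = broker.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            with done_lock:
                done.append((task_id, name))

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return done


results = run_workers(["extract", "transform", "load", "report"])
print(sorted(task for task, _ in results))
```

Which worker handles which task varies from run to run, but every task is processed exactly once, which is the property the broker provides.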
beginner
What role does the metadata database play in Airflow high availability?
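The metadata database stores the state of all workflows and tasks, and every scheduler, worker, and the web server reads from and writes to it. Making it highly available prevents lost state and keeps Airflow running if one database instance fails.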
beginner
How does running multiple schedulers improve Airflow availability?
Multiple schedulers (supported since Airflow 2.0) run at the same time and share the scheduling work. If one fails, the others keep scheduling, so workflows continue without interruption.
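The sharing and takeover behaviour can be sketched with a toy model. Real Airflow schedulers coordinate through row-level locks in the metadata database; here a plain dictionary stands in for those locks, and the `claim_batch`, `release`, and `simulate` helpers are hypothetical names used only for this illustration.

```python
def claim_batch(pending, claimed, scheduler, batch_size):
    """Claim up to batch_size runs, skipping any already held by a scheduler."""
    batch = []
    for run in pending:
        if len(batch) == batch_size:
            break
        if run not in claimed:
            claimed[run] = scheduler
            batch.append(run)
    return batch


def release(claimed, scheduler):
    """Drop every claim held by a scheduler, as if its locks were released."""
    for run in [r for r, owner in claimed.items() if owner == scheduler]:
        del claimed[run]


def simulate():
    pending = ["run1", "run2", "run3", "run4"]
    claimed = {}
    # Both schedulers are healthy: they split the work without overlap.
    claim_batch(pending, claimed, "scheduler-a", 2)
    claim_batch(pending, claimed, "scheduler-b", 2)
    # scheduler-a crashes: its claims are released ...
    release(claimed, "scheduler-a")
    # ... and scheduler-b picks up the orphaned runs on its next loop.
    claim_batch(pending, claimed, "scheduler-b", 2)
    return claimed


print(simulate())
```

After the simulated crash, all four runs end up owned by `scheduler-b`, so no run is left unscheduled.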
Which component is essential for Airflow high availability to avoid a single point of failure?
A replicated metadata database prevents data loss and downtime if one database instance fails.
What is the purpose of running multiple Airflow schedulers in an HA setup?
Multiple schedulers share the scheduling load and provide backup if one scheduler fails.
Which message broker is commonly used in Airflow HA setups to distribute tasks to workers?
Redis is a popular message broker for the CeleryExecutor. It queues the tasks sent by the scheduler until a worker picks them up.
What happens if the Airflow metadata database is not highly available?
Without a highly available database, a database outage stops scheduling entirely, and Airflow can lose critical state information, causing task failures.
Which of these is NOT a benefit of Airflow high availability?
High availability improves uptime and failover but does not fix bugs in workflow code.
Explain how multiple schedulers and a replicated metadata database work together to provide high availability in Airflow.
Think about how tasks keep running even if one part stops working.
Describe the role of a message broker in an Airflow high availability setup.
It's like a traffic controller for tasks.