What is the primary purpose of Apache Airflow in a DevOps environment?
Think about what Airflow helps automate and organize.
Apache Airflow lets you programmatically author, schedule, and monitor workflows, especially data pipelines, making complex multi-step processes easier to manage and observe.
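As a concrete illustration, a workflow in Airflow is defined as a DAG in a Python file. The sketch below assumes Apache Airflow 2.x is installed; the DAG id, task names, and callables are illustrative, not taken from the question.

```python
# Minimal DAG sketch (assumes Airflow 2.x; all names here are hypothetical).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("extract")

def load():
    print("load")

with DAG(
    dag_id="example_etl",            # hypothetical DAG name
    start_date=datetime(2024, 6, 1),
    schedule="@daily",               # Airflow 2.4+ spelling of schedule_interval
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2                         # load runs only after extract succeeds
```

The `>>` operator is how Airflow expresses task dependencies; the scheduler uses these edges to decide what can run and when.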
What output will you see when you successfully start the Apache Airflow webserver using the command airflow webserver?
airflow webserver
Look for a message indicating the webserver is running and the port number.
When the Airflow webserver starts successfully, it listens on port 8080 by default and logs a startup message that includes the listening URL (http://localhost:8080).
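A hedged command-line sketch of starting the webserver and what to look for (assumes Airflow is installed and its metadata database initialized; the exact log line comes from Gunicorn and may vary by version):

```shell
# Start the webserver on the default port
airflow webserver --port 8080

# A successful start logs the listening address, typically something like:
#   Listening at: http://0.0.0.0:8080
```

If the port is already in use, pass a different value to `--port`.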
Which configuration setting in airflow.cfg correctly sets the executor to use CeleryExecutor?
[core] executor = CeleryExecutor
The executor setting defines how tasks are run in Airflow.
Setting executor = CeleryExecutor enables distributed task execution using Celery workers.
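In practice, switching to CeleryExecutor also requires pointing Airflow at a message broker and a result backend. A hedged airflow.cfg fragment, with illustrative connection strings you would replace with your own:

```ini
[core]
executor = CeleryExecutor

[celery]
# Illustrative values -- point these at your own broker and database
broker_url = redis://localhost:6379/0
result_backend = db+postgresql://airflow:airflow@localhost/airflow
```

Without a reachable broker, the scheduler can queue tasks but no Celery worker will ever receive them.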
Given a DAG scheduled with schedule_interval='@daily' and a start date of 2024-06-01, when will the first DAG run be triggered?
Airflow schedules runs after the period ends.
Airflow triggers a DAG run at the end of its schedule interval: the first run has a logical date of 2024-06-01, covers that whole day, and is actually triggered at midnight on 2024-06-02.
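The "run fires when its interval ends" rule can be checked with plain standard-library datetimes, using the dates from the question (no Airflow required; this mirrors the scheduling arithmetic, not Airflow's internals):

```python
from datetime import datetime, timedelta

# schedule_interval='@daily' means each run covers one day
start_date = datetime(2024, 6, 1)       # DAG start_date from the question
interval = timedelta(days=1)            # '@daily'

logical_date = start_date               # the run is labeled 2024-06-01...
trigger_time = logical_date + interval  # ...but fires at 2024-06-02 00:00

print(logical_date.isoformat())   # 2024-06-01T00:00:00
print(trigger_time.isoformat())   # 2024-06-02T00:00:00
```

This is why a newly deployed daily DAG appears to "skip" its first day: the run labeled with the start date only triggers once that day is over.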
You notice a task in your Airflow DAG is stuck in the 'queued' state indefinitely. Which is the most likely cause?
Think about what controls task execution after queuing.
If no worker is available to pick the task up (for example, the Celery workers are down, the scheduler is not running, or all pool and parallelism slots are occupied), queued tasks never transition to running.
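Two quick diagnostics for this situation, sketched below under the assumption of an Airflow 2.x CLI and a CeleryExecutor deployment:

```shell
# Is a scheduler heartbeating? (exits non-zero if no recent SchedulerJob)
airflow jobs check --job-type SchedulerJob

# With CeleryExecutor, start a worker if none is running
airflow celery worker
```

If the scheduler check fails, restart the scheduler first; queued tasks cannot progress without it regardless of worker availability.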