Which of the following is the most effective way to reduce cloud costs when running Airflow workflows?
Think about when cloud providers offer lower prices and how scheduling can help.
Scheduling tasks during off-peak hours takes advantage of cheaper cloud rates, reducing costs without sacrificing reliability. Increasing parallelism or keeping workers always on tends to raise costs, while disabling retries risks failures and rework.
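In an Airflow DAG this policy is usually just a cron schedule (e.g. `schedule="0 2 * * *"` to run daily at 02:00). The off-peak check itself can be sketched in plain Python; the 22:00–06:00 window below is an assumption, since off-peak pricing hours vary by provider:

```python
from datetime import time

# Assumed off-peak window (provider-specific): 22:00 to 06:00.
OFF_PEAK_START = time(22, 0)
OFF_PEAK_END = time(6, 0)

def is_off_peak(t: time) -> bool:
    """Return True if t falls in the overnight off-peak window."""
    # The window wraps past midnight, so it is the union of two ranges.
    return t >= OFF_PEAK_START or t < OFF_PEAK_END

# In an Airflow DAG, the same policy would typically be expressed as a
# cron schedule, e.g. schedule="0 2 * * *" to run daily at 02:00.
print(is_off_peak(time(2, 0)))   # True
print(is_off_peak(time(14, 0)))  # False
```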
What is the output of the following command when no Airflow workers are currently running?
airflow celery status
Think about what the status command shows when no workers are connected.
The 'airflow celery status' command shows registered workers. If none are running, it outputs 'No nodes currently registered.'
Which Airflow configuration snippet correctly enables automatic scaling down of Celery workers to save cloud costs?
Check the correct section and the order of max and min workers in worker_autoscale.
The 'worker_autoscale' setting under the [celery] section uses the format 'max_concurrency,min_concurrency', with the maximum listed first. 'worker_autoscale = 10,3' lets each worker scale its process pool between 3 and 10 processes based on load; when it is set, 'worker_concurrency' is ignored.
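A minimal airflow.cfg fragment showing the setting in context (the 10,3 values are illustrative, not a recommendation):

```ini
[celery]
# Maximum first, then minimum: each worker scales its process
# pool between 3 and 10 processes depending on load.
worker_autoscale = 10,3
```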
You notice your cloud bill increased sharply after deploying Airflow. Which of the following is the most likely cause?
Consider what causes continuous resource usage and costs.
If Airflow workers remain running around the clock, they consume cloud resources continuously, even while idle, which inflates the bill. Infrequent scheduling reduces costs; the LocalExecutor runs tasks on the scheduler's machine rather than on separate cloud workers; and logs stored locally do not add to cloud costs.
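The always-on effect is easy to see with back-of-the-envelope arithmetic. The numbers below (5 workers at $0.20/hour, an 8-hour daily busy window) are hypothetical, chosen only to illustrate the comparison:

```python
# Hypothetical numbers for illustration only: 5 workers at $0.20/hour.
RATE_PER_HOUR = 0.20
WORKERS = 5
HOURS_PER_MONTH = 730

# Always-on: every worker is billed around the clock.
always_on = WORKERS * RATE_PER_HOUR * HOURS_PER_MONTH

# Scaled down to a single worker outside an assumed 8-hour daily busy window.
busy_hours = 8 * 30
idle_hours = HOURS_PER_MONTH - busy_hours
scaled = RATE_PER_HOUR * (WORKERS * busy_hours + 1 * idle_hours)

print(f"always-on:  ${always_on:.2f}/month")  # $730.00/month
print(f"autoscaled: ${scaled:.2f}/month")     # $338.00/month
```

Even with modest rates, leaving the full fleet running more than doubles the monthly worker bill in this sketch.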
You want to optimize an Airflow DAG to minimize cloud costs by reducing unnecessary task runs. Which workflow change achieves this best?
Think about avoiding running tasks that are not needed.
Using trigger rules so that downstream tasks run only when their upstream tasks succeed (Airflow's default all_success behavior) prevents unnecessary task runs, saving cloud costs. Running all tasks daily or removing dependencies can increase costs, and raising the retry count can add cost through repeated runs.
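The gating logic behind Airflow's all_success trigger rule can be sketched in plain Python (this is an illustrative model, not Airflow's implementation): a downstream step runs only if every upstream step succeeded, so failed branches never burn compute on work that would be thrown away.

```python
def run_pipeline(upstreams, downstream):
    """Run upstream callables; run downstream only if all succeed."""
    results = {}
    for name, task in upstreams.items():
        try:
            task()
            results[name] = "success"
        except Exception:
            results[name] = "failed"
    # Mirrors the all_success rule: any upstream failure skips downstream.
    if all(state == "success" for state in results.values()):
        downstream()
        results["downstream"] = "success"
    else:
        results["downstream"] = "skipped"  # no wasted task run
    return results

states = run_pipeline(
    {"extract": lambda: None, "validate": lambda: 1 / 0},  # validate fails
    downstream=lambda: None,
)
print(states)  # {'extract': 'success', 'validate': 'failed', 'downstream': 'skipped'}
```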