Apache Airflow · DevOps · ~30 mins

Kubernetes executor for dynamic scaling in Apache Airflow - Mini Project: Build & Apply

Kubernetes Executor for Dynamic Scaling in Airflow
📖 Scenario: You are setting up Apache Airflow to run tasks with the Kubernetes executor. This setup lets Airflow create a new pod for each task, so the system scales automatically with the workload. Imagine you manage a bakery that receives many orders. Instead of baking every cake in one oven, you fire up new ovens (pods) only when they are needed. This project walks you through configuring Airflow to do the same with Kubernetes.
🎯 Goal: Build a simple Airflow configuration that uses the Kubernetes executor to run tasks dynamically on Kubernetes pods. You will create the initial Airflow configuration, add the Kubernetes executor settings, write a simple DAG with one task, and finally confirm the DAG is ready to run with dynamic pod creation.
📋 What You'll Learn
Create an Airflow configuration dictionary with basic settings
Add Kubernetes executor specific configuration variables
Write a simple Airflow DAG with one task using the Kubernetes executor
Print the DAG's task status to confirm dynamic pod execution
💡 Why This Matters
🌍 Real World
Using the Kubernetes executor in Airflow helps teams run many tasks in parallel by creating pods dynamically. This is like opening new ovens only when needed in a bakery, saving resources and time.
💼 Career
Many DevOps and data engineering roles require knowledge of Airflow and Kubernetes. Understanding how to configure Airflow with the Kubernetes executor is a valuable skill for managing scalable workflows.
1
Create basic Airflow configuration dictionary
Create a dictionary called airflow_config with these exact entries: 'executor': 'SequentialExecutor', 'dags_folder': '/usr/local/airflow/dags', and 'sql_alchemy_conn': 'sqlite:////usr/local/airflow/airflow.db'.
Apache Airflow
Hint

Use curly braces {} to create the dictionary and include the exact keys and values as strings.
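A minimal sketch of this step. The keys mirror common `[core]` settings from `airflow.cfg`, here represented as a plain Python dictionary for the exercise:

```python
# Basic Airflow configuration as a plain Python dictionary.
# These keys correspond to [core] settings in airflow.cfg.
airflow_config = {
    'executor': 'SequentialExecutor',
    'dags_folder': '/usr/local/airflow/dags',
    'sql_alchemy_conn': 'sqlite:////usr/local/airflow/airflow.db',
}

print(airflow_config['executor'])  # SequentialExecutor
```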

2
Add Kubernetes executor configuration
Add these two entries to the existing airflow_config dictionary: 'executor': 'KubernetesExecutor' to replace the old executor, and 'kube_namespace': 'airflow' to specify the Kubernetes namespace.
Apache Airflow
Hint

Replace the 'executor' value and add the 'kube_namespace' key with value 'airflow'.
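Building on the dictionary from step 1, this sketch swaps the executor and adds the namespace key. In a real deployment `kube_namespace` tells Airflow which Kubernetes namespace to launch task pods in:

```python
# Configuration from step 1.
airflow_config = {
    'executor': 'SequentialExecutor',
    'dags_folder': '/usr/local/airflow/dags',
    'sql_alchemy_conn': 'sqlite:////usr/local/airflow/airflow.db',
}

# Switch to the Kubernetes executor and set the namespace that task pods run in.
airflow_config['executor'] = 'KubernetesExecutor'
airflow_config['kube_namespace'] = 'airflow'

print(airflow_config['executor'])  # KubernetesExecutor
```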

3
Write a simple Airflow DAG using Kubernetes executor
Write a Python code snippet that imports DAG and PythonOperator from airflow, creates a DAG named example_k8s_dag with default args including start_date as datetime(2024, 1, 1), and adds one PythonOperator task called print_hello that runs a function printing 'Hello from Kubernetes Executor'.
Apache Airflow
Hint

Use PythonOperator with python_callable=print_hello and assign it to the DAG.
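A sketch of the DAG, assuming Airflow 2.x (where `PythonOperator` lives in `airflow.operators.python`; in Airflow 1.x the import path is `airflow.operators.python_operator`). With the `KubernetesExecutor` configured, each task instance in this DAG would be launched in its own pod:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator  # Airflow 2.x import path

def print_hello():
    print('Hello from Kubernetes Executor')

# Each task instance in this DAG runs in its own dynamically created pod
# when the KubernetesExecutor is configured.
with DAG(
    dag_id='example_k8s_dag',
    default_args={'start_date': datetime(2024, 1, 1)},
    schedule_interval=None,
    catchup=False,
) as dag:
    print_hello_task = PythonOperator(
        task_id='print_hello',
        python_callable=print_hello,
    )
```

Note that `schedule_interval` was renamed to `schedule` in Airflow 2.4+; either works in recent 2.x releases, but check your installed version.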

4
Print the task status to confirm dynamic pod execution
Write a print statement that outputs the string 'Task print_hello is ready to run on Kubernetes Executor'.
Apache Airflow
Hint

Use print() with the exact string to confirm the task setup.
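The final confirmation step is a single print statement:

```python
# Confirm the task is configured for the Kubernetes executor.
msg = 'Task print_hello is ready to run on Kubernetes Executor'
print(msg)
```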