
Multi-environment deployment (dev, staging, prod) in Apache Airflow - Mini Project: Build & Apply

Multi-environment deployment (dev, staging, prod) with Airflow
📖 Scenario: You are working as a data engineer managing workflows with Apache Airflow. You want to deploy the same workflow to three different environments: development, staging, and production. Each environment has its own configuration settings, such as the schedule interval and the number of retries. This project will guide you through creating a simple Airflow DAG that adapts its behavior to the environment it is deployed to.
🎯 Goal: Build an Airflow DAG that uses a configuration dictionary to set environment-specific parameters. You will create the base DAG, add environment configuration, apply the config to the DAG, and finally print the DAG details to verify the setup.
📋 What You'll Learn
Create a dictionary with environment configurations
Add a variable to select the current environment
Use the selected environment config to set DAG parameters
Print the DAG id and schedule interval to confirm correct setup
💡 Why This Matters
🌍 Real World
In real projects, teams deploy the same Airflow workflows to multiple environments to test changes safely before production. This setup helps avoid mistakes and downtime.
💼 Career
Understanding multi-environment deployment is essential for DevOps engineers and data engineers to manage workflows reliably and safely across development, testing, and production.
1
Create environment configurations dictionary
Create a dictionary called env_configs with these exact entries: 'dev', 'staging', and 'prod'. Each key should map to another dictionary with keys 'schedule_interval' and 'retries' set as follows:
'dev': {'schedule_interval': '@daily', 'retries': 1}
'staging': {'schedule_interval': '@hourly', 'retries': 2}
'prod': {'schedule_interval': '@once', 'retries': 3}
Hint

Use a dictionary with keys 'dev', 'staging', and 'prod'. Each key maps to another dictionary with 'schedule_interval' and 'retries'.
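As a sketch, the dictionary described above could be written like this (the keys and values follow the step exactly):

```python
# Environment-specific settings for the DAG, keyed by environment name.
env_configs = {
    'dev':     {'schedule_interval': '@daily',  'retries': 1},
    'staging': {'schedule_interval': '@hourly', 'retries': 2},
    'prod':    {'schedule_interval': '@once',   'retries': 3},
}
```

Looking up `env_configs['prod']['retries']` would then give `3` — each environment's settings live behind a single key, which is what lets one DAG definition serve all three environments.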

2
Set current environment variable
Create a variable called current_env and set it to the string 'staging' to select the staging environment.
Hint

Assign the string 'staging' to the variable named current_env.
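This step is a single assignment, but it is the switch that retargets the whole DAG — a sketch:

```python
# Select which environment's settings the DAG should use.
# Change this one string to 'dev' or 'prod' to redeploy elsewhere.
current_env = 'staging'
```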

3
Create DAG using environment config
Import DAG from airflow and datetime from Python's datetime module. Then create a DAG object called dag with dag_id set to 'example_dag', start_date set to datetime(2024, 1, 1), and both schedule_interval and default_args['retries'] set from the values in env_configs[current_env].
Hint

Use env_configs[current_env]['schedule_interval'] for schedule_interval and env_configs[current_env]['retries'] for retries in default_args.

4
Print DAG details to verify environment setup
Write two print statements: one to print the string 'DAG ID:' followed by dag.dag_id, and another to print 'Schedule Interval:' followed by dag.schedule_interval.
Hint

Use print('DAG ID:', dag.dag_id) and print('Schedule Interval:', dag.schedule_interval).