0
0
Apache Airflowdevops~30 mins

DAG performance tracking in Apache Airflow - Mini Project: Build & Apply

Choose your learning style9 modes available
DAG Performance Tracking in Airflow
📖 Scenario: You are managing workflows using Apache Airflow. You want to track the performance of your DAG runs by recording the duration of each run.This helps you understand how long your workflows take and identify any delays.
🎯 Goal: Build a simple Airflow DAG that tracks the duration of each DAG run and stores this information in a Python dictionary.You will create the data structure, add a configuration variable for minimum duration threshold, calculate durations, and finally print the durations.
📋 What You'll Learn
Create a dictionary to store DAG run durations
Add a minimum duration threshold variable
Calculate the duration of each DAG run using start and end times
Print the dictionary with DAG run durations
💡 Why This Matters
🌍 Real World
Tracking DAG run durations helps teams monitor workflow performance and detect delays or failures early.
💼 Career
Understanding how to track and analyze DAG performance is important for DevOps engineers and data engineers working with workflow automation.
Progress0 / 4 steps
1
Create a dictionary to store DAG run durations
Create a dictionary called dag_run_durations with these exact entries: 'run_1': 0, 'run_2': 0, 'run_3': 0.
Apache Airflow
Hint

Use curly braces to create a dictionary with keys and values.

2
Add a minimum duration threshold variable
Add a variable called min_duration_threshold and set it to 5 (minutes).
Apache Airflow
Hint

Just assign the number 5 to the variable min_duration_threshold.

3
Calculate the duration of each DAG run
Given these start and end times in minutes for each run: start_times = {'run_1': 0, 'run_2': 10, 'run_3': 20} and end_times = {'run_1': 7, 'run_2': 15, 'run_3': 30}, use a for loop with variables run and start to iterate over start_times.items(). Calculate the duration as end_times[run] - start and update dag_run_durations[run] with this value.
Apache Airflow
Hint

Use a for loop to go through start_times and calculate duration using end_times.

4
Print the dictionary with DAG run durations
Write a print statement to display the dag_run_durations dictionary.
Apache Airflow
Hint

Use print(dag_run_durations) to show the durations.