Apache Airflow | devops | ~30 mins

Sharing data between tasks effectively in Apache Airflow - Mini Project: Build & Apply

📖 Scenario: You are building a simple Airflow workflow to process daily sales data. The first task fetches sales numbers, and the second task calculates the total sales. You want to share the sales data between these tasks efficiently.
🎯 Goal: Build an Airflow DAG where one task pushes sales data, and the next task pulls that data to calculate the total sales.
📋 What You'll Learn
Create a Python dictionary called sales_data with exact entries: 'Monday': 100, 'Tuesday': 150, 'Wednesday': 200
Create a task called push_sales that pushes sales_data to XCom
Create a task called pull_and_sum_sales that pulls sales_data from XCom and calculates the total sales
Print the total sales in the pull_and_sum_sales task
💡 Why This Matters
🌍 Real World
Sharing data between tasks is common in workflows where one step produces data that the next step needs to use, like processing sales, logs, or user data.
💼 Career
Understanding XCom in Airflow is essential for building reliable data pipelines and workflows in many DevOps and data engineering roles.
Step 1: Create the sales data dictionary
Create a Python dictionary called sales_data with these exact entries: 'Monday': 100, 'Tuesday': 150, 'Wednesday': 200
Hint: Use curly braces {} to create a dictionary with keys and values.
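
As a minimal sketch of the step above, the dictionary can be written as a literal. The keys and values are the ones listed in the step; the comment is only illustrative.

```python
# Daily sales figures keyed by weekday, exactly as listed in the step.
sales_data = {'Monday': 100, 'Tuesday': 150, 'Wednesday': 200}
```

Defining sales_data at module level lets the push_sales function in the next step refer to it directly.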

Step 2: Create the push_sales task to push data to XCom
Create a Python function called push_sales that takes **kwargs and pushes the sales_data dictionary to XCom using kwargs['ti'].xcom_push(key='sales', value=sales_data)
Hint: Use kwargs['ti'].xcom_push to store the data in XCom so that a downstream task can pull it.
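
The push step might look like the sketch below. The push_sales function follows the step as written. FakeTaskInstance is a hypothetical stand-in (it is not part of Airflow) that exists only so the function can be exercised without a running Airflow installation; in a real DAG, Airflow supplies the actual task instance as kwargs['ti'].

```python
sales_data = {'Monday': 100, 'Tuesday': 150, 'Wednesday': 200}


def push_sales(**kwargs):
    # 'ti' is the task instance that Airflow passes in the task context.
    kwargs['ti'].xcom_push(key='sales', value=sales_data)


class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's task instance (assumption, test only)."""

    def __init__(self):
        self.pushed = {}  # records every key/value pair that was pushed

    def xcom_push(self, key, value):
        self.pushed[key] = value


# Exercise the function outside Airflow with the stand-in object.
fake_ti = FakeTaskInstance()
push_sales(ti=fake_ti)
```

In an actual DAG, push_sales would typically be passed as the python_callable of a PythonOperator whose task_id is 'push_sales', since that is the task ID the pull step looks up.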

Step 3: Create the pull_and_sum_sales task to pull data and calculate total
Create a Python function called pull_and_sum_sales that takes **kwargs. Pull the sales data from XCom using kwargs['ti'].xcom_pull(key='sales', task_ids='push_sales'), sum its values, and store the result in a variable called total_sales
Hint: Use kwargs['ti'].xcom_pull to get the data pushed by the previous task, then sum the dictionary values.
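
A possible shape for the pull step is sketched below. The return statement is an addition for illustration, so the result can be checked outside Airflow; the step itself only asks for the total_sales variable. FakeTaskInstance is again a hypothetical stand-in, not an Airflow class, and its store is pre-filled with the value that push_sales would have pushed.

```python
def pull_and_sum_sales(**kwargs):
    # Look up the value stored under key 'sales' by the task with ID 'push_sales'.
    sales_data = kwargs['ti'].xcom_pull(key='sales', task_ids='push_sales')
    total_sales = sum(sales_data.values())
    return total_sales  # added for illustration; not required by the step


class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's task instance (assumption, test only)."""

    def __init__(self, store):
        self.store = store  # maps (task_id, key) -> value

    def xcom_pull(self, key, task_ids):
        return self.store[(task_ids, key)]


# Simulate the XCom entry that push_sales would have written.
store = {('push_sales', 'sales'): {'Monday': 100, 'Tuesday': 150, 'Wednesday': 200}}
total = pull_and_sum_sales(ti=FakeTaskInstance(store))  # total == 450
```

The task_ids argument has to match the task ID of the pushing task, and the key has to match the key used in xcom_push; otherwise the lookup finds nothing.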

Step 4: Print the total sales in the pull_and_sum_sales task
Add a print statement inside the pull_and_sum_sales function to display the text Total sales: {total_sales} using an f-string
Hint: Use print(f"Total sales: {total_sales}") to show the result.
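
Putting the four steps together, the two functions could look like the sketch below. FakeTaskInstance and the shared store dictionary are hypothetical stand-ins (they are not part of Airflow) that mimic XCom well enough to run both functions in order outside a scheduler; in a real DAG, Airflow supplies kwargs['ti'] and persists the pushed value itself.

```python
sales_data = {'Monday': 100, 'Tuesday': 150, 'Wednesday': 200}


def push_sales(**kwargs):
    kwargs['ti'].xcom_push(key='sales', value=sales_data)


def pull_and_sum_sales(**kwargs):
    pulled = kwargs['ti'].xcom_pull(key='sales', task_ids='push_sales')
    total_sales = sum(pulled.values())
    print(f"Total sales: {total_sales}")


class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's task instance (assumption, test only)."""

    def __init__(self, task_id, store):
        self.task_id = task_id
        self.store = store  # shared dict: (task_id, key) -> value

    def xcom_push(self, key, value):
        self.store[(self.task_id, key)] = value

    def xcom_pull(self, key, task_ids):
        return self.store[(task_ids, key)]


# Run the two "tasks" in order, sharing one in-memory store.
store = {}
push_sales(ti=FakeTaskInstance('push_sales', store))
pull_and_sum_sales(ti=FakeTaskInstance('pull_and_sum_sales', store))
# prints: Total sales: 450
```

Inside Airflow, the printed line would appear in the task log of pull_and_sum_sales rather than on the console.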