Apache Airflow, devops, ~30 mins

Mapped tasks for parallel processing in Apache Airflow - Mini Project: Build & Apply

Mapped tasks for parallel processing in Airflow
📖 Scenario: You are using Apache Airflow to automate data processing. You want to run the same task on several inputs at the same time instead of one after another. This is like a kitchen where you bake many cookies at once rather than one by one.
🎯 Goal: Build an Airflow DAG that uses mapped tasks to process a list of numbers in parallel. Each task will square a number from the list.
📋 What You'll Learn
Create a list of numbers to process
Define a simple Python function to square a number
Use Airflow's task mapping to run the function on each number in the list
Print the results of each squared number
💡 Why This Matters
🌍 Real World
Mapped tasks let you run many similar jobs at the same time, like processing many files or data chunks quickly.
💼 Career
Knowing how to use mapped tasks in Airflow is useful for data engineers and DevOps professionals automating workflows efficiently.
1
Create the list of numbers to process
Create a list called numbers with these exact values: [1, 2, 3, 4, 5]
Need a hint?

Use square brackets to create a list and separate numbers with commas.

2
Define a Python function to square a number
Define a function called square_number that takes one argument n and returns n * n
Need a hint?

Use def to define a function and return to send back the result.
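
Steps 1 and 2 can be checked in plain Python before Airflow is involved. The sketch below is only an illustration: the list comprehension is a stand-in for what task mapping does later (one call per element), not Airflow's own mechanism, and the name squared is introduced here just for the check.

```python
# Step 1: the inputs to process.
numbers = [1, 2, 3, 4, 5]


# Step 2: the function that each mapped task instance will run.
def square_number(n):
    return n * n


# Stand-in for mapping: call the function once per element, in order.
squared = [square_number(n) for n in numbers]
print(squared)  # [1, 4, 9, 16, 25]
```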

3
Create an Airflow DAG with mapped tasks
Import DAG from airflow and task from airflow.decorators. Create a DAG with the dag_id square_numbers_dag and a start_date of 2023-01-01 (passed as a datetime object). Apply the @task decorator to square_number. Call square_number.expand(n=numbers) to map the task over the list numbers, and store the result in a variable called squared_results.
Need a hint?

Use with DAG(...) to create the DAG context. Decorate the function with @task. Use .expand() to map the task.

4
Print the results of the mapped tasks
Add a task called print_results, decorated with @task, that takes results as input and prints them. Call print_results(squared_results) inside the DAG. Run the DAG and note the output that print_results writes to its task log.
Need a hint?

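The task log should show the list of squared numbers: [1, 4, 9, 16, 25]. Airflow passes the mapped results to the downstream task as a lazy sequence, so printing results directly may show a proxy object instead; convert it with list(results) before printing.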