Apache Airflow · devops · ~20 mins

Mapped tasks for parallel processing in Apache Airflow - Practice Problems & Coding Challenges

Challenge - 5 Problems
💻 Command Output (intermediate)
Output of mapped task execution in Airflow
Given the following Airflow DAG snippet using task mapping, what will be the output logs of the mapped task?
from airflow import DAG
from airflow.decorators import task
from datetime import datetime

with DAG('mapped_task_example', start_date=datetime(2023, 1, 1), schedule_interval='@daily', catchup=False) as dag:
    @task
    def multiply_by_two(number):
        return number * 2

    numbers = [1, 2, 3]
    results = multiply_by_two.expand(number=numbers)
A. Task multiply_by_two fails due to unsupported input type
B. Task multiply_by_two runs once with input [1, 2, 3] and outputs [2, 4, 6]
C. Task multiply_by_two runs 3 times but outputs are all 0
D. Task multiply_by_two runs 3 times with inputs 1, 2, 3 and outputs 2, 4, 6 respectively
💡 Hint
Mapped tasks run once per item in the input list, producing one output per run.
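To see the fan-out the hint describes, here is a minimal plain-Python sketch (no Airflow runtime required) that simulates what `.expand()` does: Airflow schedules one task instance per list element, and each instance pushes its own result to XCom. The loop here stands in for that per-element scheduling.

```python
# Hypothetical simulation of dynamic task mapping, not Airflow itself.
def multiply_by_two(number):
    return number * 2

numbers = [1, 2, 3]

# Airflow would run one mapped task instance per element;
# we simulate the fan-out with a comprehension.
results = [multiply_by_two(n) for n in numbers]
print(results)  # [2, 4, 6] -- one output per mapped instance
```

In real Airflow logs you would see three separate task-instance runs (map indexes 0, 1, 2), each logging its own input and returned value.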
🧠 Conceptual (intermediate)
Understanding task mapping behavior in Airflow
Which statement correctly describes how Airflow handles mapped tasks when the input list is empty?
A. The mapped task does not run at all
B. The mapped task runs infinitely until stopped
C. The mapped task runs once with a None input
D. The mapped task raises a runtime error
💡 Hint
Think about what happens if there is nothing to process.
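The empty-list case can be sketched the same way: mapping over zero elements creates zero task instances, so nothing runs and no XCom values are produced. This is a plain-Python simulation, not Airflow itself.

```python
# Hypothetical simulation: expanding over an empty list schedules
# zero mapped task instances, so there are zero outputs.
def multiply_by_two(number):
    return number * 2

empty_results = [multiply_by_two(n) for n in []]
print(empty_results)  # [] -- no instances, no runs, no outputs
```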
Troubleshoot (advanced)
Troubleshooting mapped task failure due to XCom size limit
You have a mapped task that returns a very large list as output, but the task fails with an XCom size limit error. What is the best way to fix this issue?
A. Split the output into smaller chunks and map over those chunks instead
B. Increase the XCom size limit in Airflow configuration
C. Disable XCom for the task
D. Return the large list as a string instead of a list
💡 Hint
Think about how to reduce the size of data passed between tasks.
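One way to shrink the per-task payload is to chunk the data before mapping, so each XCom holds only one chunk. The helper below is a hypothetical sketch of that chunking step; in a real DAG each chunk would become the input of one mapped task instance.

```python
# Hypothetical chunking helper: split a large list into fixed-size
# pieces so each mapped instance passes a small XCom payload.
def chunk(items, size):
    """Yield successive fixed-size chunks from a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

large_output = list(range(10))
chunks = list(chunk(large_output, 4))
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```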
🔀 Workflow (advanced)
Designing a workflow with mapped tasks and dependencies
You want to create an Airflow DAG where a mapped task processes a list of files, and after all mapped tasks finish, a final task aggregates the results. Which approach correctly ensures the final task runs only after all mapped tasks complete?
A. Set the final task downstream of the mapped task using final_task >> mapped_task
B. Set the final task downstream of the mapped task using final_task.set_downstream(mapped_task)
C. Set the final task downstream of the mapped task using mapped_task >> final_task
D. Set the final task downstream of the mapped task using final_task.set_upstream(mapped_task)
💡 Hint
Remember the direction of dependencies in Airflow DAGs.
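The dependency direction can be simulated in plain Python (the function names `process_file` and `aggregate` are illustrative, not from the question): the downstream aggregate can only run once every upstream mapped result exists, which is exactly what `mapped_task >> final_task` expresses in a DAG.

```python
# Hypothetical simulation of mapped_task >> final_task ordering.
def process_file(path):
    return len(path)  # stand-in for per-file processing work

def aggregate(results):
    return sum(results)

files = ["a.csv", "bb.csv", "ccc.csv"]
mapped_results = [process_file(f) for f in files]  # all mapped instances finish first
total = aggregate(mapped_results)                  # aggregate runs only after that
print(total)  # 18
```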
Best Practice (expert)
Best practice for handling dynamic task mapping with large input lists
When using Airflow's mapped tasks with a very large input list (thousands of items), what is the best practice to avoid scheduler overload and improve performance?
A. Map over the entire large list at once to maximize parallelism
B. Use batch processing by splitting the input list into smaller chunks and map over each chunk sequentially
C. Disable task retries to reduce scheduler load
D. Increase the number of scheduler workers without changing the DAG
💡 Hint
Think about balancing parallelism and system limits.
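The batching idea can be sketched in plain Python (hypothetical names, no Airflow runtime): instead of creating thousands of mapped instances at once, the input is processed batch by batch, so only one batch's worth of work is in flight at a time while items within a batch can still run in parallel.

```python
# Hypothetical batching sketch: sequential batches, parallelism
# confined to one batch at a time.
def chunked(items, size):
    for i in range(0, len(items), size):
        yield items[i:i + size]

def handle(item):
    return item * 2  # stand-in for the mapped task's work

big_input = list(range(10_000))
outputs = []
for batch in chunked(big_input, 1_000):        # batches run one after another
    outputs.extend(handle(x) for x in batch)   # within a batch, Airflow would map in parallel
print(len(outputs))  # 10000
```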