XCom Mastery
Challenge: 5 Problems
Get all challenges correct to earn this badge. Test your skills under time pressure!
🧠 Conceptual
Intermediate · 1:30 time limit
Why does Airflow limit XCom size?
Airflow has a default size limit for XCom data. Why is this limit important?
💡 Hint
Think about what happens if too much data is stored in the Airflow metadata database.
✓ Answer
Airflow stores XCom data in its metadata database. Large XComs bloat that database quickly, slowing scheduler queries and degrading overall performance.
💻 Command Output
Intermediate · 1:30 time limit
What happens when an XCom exceeds the size limit?
Given a task that tries to push an XCom value larger than the default limit, what is the expected behavior?
Apache Airflow:
    task_instance.xcom_push(key='large_data', value='x' * 5000000)  # 5MB string
💡 Hint
Consider what Airflow does to protect the metadata database from large data.
✓ Answer
When an XCom value exceeds the size limit, Airflow raises an AirflowException and the task fails, rather than writing the oversized value to the metadata database.
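The rejection behavior can be sketched with a simplified stand-in for the size check. This is an illustration only: the function, limit value, and exception type below are hypothetical, and the real limit and exception depend on your Airflow version and metadata backend.

```python
import json

# Hypothetical limit for this sketch; Airflow's actual limit varies by backend.
XCOM_SIZE_LIMIT = 48 * 1024  # bytes

def push_xcom(store, key, value, limit=XCOM_SIZE_LIMIT):
    """Serialize the value and refuse to store it if it exceeds the limit."""
    payload = json.dumps(value).encode("utf-8")
    if len(payload) > limit:
        raise ValueError(
            f"XCom value for {key!r} is {len(payload)} bytes, "
            f"exceeding the {limit}-byte limit"
        )
    store[key] = payload

store = {}
push_xcom(store, "small", {"rows": 10})       # fits, gets stored
try:
    push_xcom(store, "large", "x" * 100_000)  # too big, rejected
except ValueError as err:
    print(err)
```

The key point the sketch mirrors is that the oversized value is rejected before it reaches the store, so the database never grows from a failed push.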
✅ Best Practice
Advanced · 2:00 time limit
Best alternative to large XCom data
If you need to share large data between tasks in Airflow, which approach is best to avoid XCom size limits?
💡 Hint
Think about scalable and reliable ways to share large data between tasks.
✓ Answer
Storing large data externally and passing only a reference through XCom keeps the metadata database small and sidesteps the size limit entirely.
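A minimal sketch of the "pass a reference, not the data" pattern, comparing what a naive push would store against what the reference pattern stores. The bucket URI is hypothetical; in practice it would point at S3, GCS, HDFS, or similar, and the upload itself is omitted.

```python
import json

# A large result that would blow past a typical XCom limit if pushed inline.
large_result = {"rows": [{"id": i, "payload": "x" * 100} for i in range(10_000)]}

# Instead of pushing large_result itself, upload it externally (not shown)
# and push only its location through XCom.
xcom_value = "s3://my-bucket/dag_runs/2024-01-01/task_a/output.json"  # hypothetical URI

inline_size = len(json.dumps(large_result))  # what a naive xcom_push would serialize
reference_size = len(xcom_value)             # what the reference pattern stores
print(inline_size, reference_size)
```

The reference is a few dozen bytes regardless of how large the underlying result grows, which is what keeps the metadata database small.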
❓ Troubleshoot
Advanced · 2:00 time limit
Diagnosing XCom size limit errors
You see a task failing with an error mentioning XCom size limit. What is the most effective first step to fix this?
💡 Hint
Think about the root cause of the error and how to avoid it.
✓ Answer
The error occurs because the task pushes data that is too large for XCom. Rewriting the task to store the data externally and push only a reference fixes the root cause rather than the symptom.
🔀 Workflow
Expert · 2:30 time limit
Design a workflow to handle large data passing between tasks
You have a DAG where Task A generates a 100MB file and Task B needs to process it. How should you design the data passing to avoid XCom size limits?
💡 Hint
Consider scalability, reliability, and Airflow best practices for large data.
✓ Answer
Uploading large files to external storage and passing only the path via XCom is the recommended approach for moving large data between tasks efficiently.
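The Task A / Task B design can be sketched with plain functions and a dict standing in for Airflow tasks and the XCom backend. This is a simulation, not a real DAG: in production these would be TaskFlow or PythonOperator tasks, and the file would go to S3/GCS rather than a local temp directory.

```python
import os
import tempfile

xcom = {}  # stand-in for the XCom backend

def task_a_generate() -> None:
    """Task A: write the large result to (simulated) external storage,
    then push only its path through XCom."""
    fd, path = tempfile.mkstemp(suffix=".dat")
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * (1024 * 1024))  # 1MB here; the pattern is identical at 100MB
    xcom["output_path"] = path          # a short string, far below any XCom limit

def task_b_process() -> int:
    """Task B: pull the path from XCom and read the file from storage."""
    path = xcom["output_path"]
    return os.path.getsize(path)

task_a_generate()
processed_bytes = task_b_process()
print(processed_bytes)
```

Only the path crosses XCom, so the design scales with the file size: Task B streams the data directly from storage instead of routing 100MB through the metadata database.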