Apache Airflow · DevOps · ~20 mins

XCom size limitations and alternatives in Apache Airflow - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Why does Airflow limit XCom size?
Airflow has a default size limit for XCom data. Why is this limit important?
A. To stop tasks from running longer than their timeout
B. To prevent the metadata database from growing too large and slowing down the scheduler
C. To prevent tasks from failing due to memory overflow on worker nodes
D. To ensure XCom data is encrypted automatically
💡 Hint
Think about what happens if too much data is stored in the Airflow metadata database.
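Every XCom value is serialized and stored as a row in the metadata database, so oversized payloads bloat the tables the scheduler queries constantly. A minimal sketch of estimating that footprint, assuming Airflow's default JSON serialization (`xcom_payload_size` is an illustrative helper, not an Airflow API):

```python
import json

def xcom_payload_size(value) -> int:
    """Approximate the bytes an XCom value would occupy once
    JSON-serialized for the metadata database (the exact on-disk
    size depends on the database backend)."""
    return len(json.dumps(value).encode("utf-8"))

# A small task result vs. one that belongs in external storage.
small = {"rows_processed": 1024, "status": "ok"}
large = {"rows": list(range(100_000))}  # every element lands in the DB row

print(xcom_payload_size(small))  # tens of bytes
print(xcom_payload_size(large))  # hundreds of kilobytes
```

Multiply the large case by thousands of task instances per day and the size of the XCom table, and the cost of scheduler queries against it, grows quickly.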
💻 Command Output
intermediate
What happens when an XCom exceeds the size limit?
Given a task that tries to push an XCom value larger than the default limit, what is the expected behavior?
Python
task_instance.xcom_push(key='large_data', value='x' * 5000000)  # 5MB string
A. The scheduler crashes due to database overload
B. The XCom is silently truncated to the allowed size and the task succeeds
C. The task succeeds and the full data is stored without error
D. The task fails with an AirflowException about the XCom size limit
💡 Hint
Consider what Airflow does to protect the metadata database from large data.
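The behavior the question describes is a loud failure rather than truncation or silent storage. A hedged sketch of such a guard, assuming a 1 MiB cap for illustration (`MAX_XCOM_SIZE`, `XComSizeError`, and `checked_xcom_push` are illustrative names, not Airflow APIs):

```python
import json

MAX_XCOM_SIZE = 1 * 1024 * 1024  # assumed 1 MiB cap for this sketch

class XComSizeError(Exception):
    """Raised when a pushed value exceeds the configured size cap."""

def checked_xcom_push(store: dict, key: str, value) -> None:
    # Serialize first, then reject anything over the cap instead of
    # truncating it or letting it reach the metadata database.
    payload = json.dumps(value).encode("utf-8")
    if len(payload) > MAX_XCOM_SIZE:
        raise XComSizeError(
            f"XCom {key!r} is {len(payload)} bytes, over the "
            f"{MAX_XCOM_SIZE}-byte limit; store it externally instead."
        )
    store[key] = payload

store = {}
checked_xcom_push(store, "small", {"ok": True})          # fits, stored
try:
    checked_xcom_push(store, "large_data", "x" * 5_000_000)  # ~5 MB, fails
except XComSizeError as exc:
    print(exc)
```

Failing the task at push time keeps the oversized row out of the database entirely, which is exactly the protection the limit exists to provide.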
Best Practice
advanced
Best alternative to large XCom data
If you need to share large data between tasks in Airflow, which approach is best to avoid XCom size limits?
A. Store the data in external storage such as S3 or a database and pass the reference via XCom
B. Split the data into many small XComs under the size limit
C. Compress the data and push it as a single XCom
D. Increase the Airflow XCom size limit in the configuration
💡 Hint
Think about scalable and reliable ways to share large data between tasks.
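The reference-passing pattern in option A can be sketched without any cloud dependencies: here a local directory stands in for an S3 bucket, the producer writes the large payload there, and only a short object key travels through "XCom" (`STORAGE`, `produce`, and `consume` are illustrative names for this sketch):

```python
import json
import tempfile
import uuid
from pathlib import Path

STORAGE = Path(tempfile.mkdtemp())  # stand-in for an S3 bucket

def produce() -> str:
    """Upstream task: write the large result to external storage,
    return only a short reference (what would go into XCom)."""
    data = {"rows": list(range(50_000))}       # far too big for XCom
    ref = f"{uuid.uuid4()}.json"               # like an S3 object key
    (STORAGE / ref).write_text(json.dumps(data))
    return ref

def consume(ref: str) -> int:
    """Downstream task: resolve the reference and process the real data."""
    data = json.loads((STORAGE / ref).read_text())
    return len(data["rows"])

ref = produce()
print(len(ref))      # the XCom payload is just a short key string
print(consume(ref))  # 50000
```

The metadata database only ever sees the key, so the pattern scales to arbitrarily large payloads and keeps the data in a store built for it.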
Troubleshoot
advanced
Diagnosing XCom size limit errors
You see a task failing with an error mentioning XCom size limit. What is the most effective first step to fix this?
A. Modify the task to store large data externally and push only a reference in XCom
B. Restart the Airflow scheduler to clear the error
C. Increase the worker memory to handle larger XComs
D. Delete old XCom entries from the metadata database manually
💡 Hint
Think about the root cause of the error and how to avoid it.
🔀 Workflow
expert
Design a workflow to handle large data passing between tasks
You have a DAG where Task A generates a 100MB file and Task B needs to process it. How should you design the data passing to avoid XCom size limits?
A. Task A encodes the file content as base64 and pushes it via XCom; Task B decodes it
B. Task A splits the file into 10 MB chunks and pushes each chunk as a separate XCom; Task B reassembles them
C. Task A uploads the file to S3 and pushes the S3 path via XCom; Task B downloads from S3 using the path
D. Task A pushes the file content directly via XCom; Task B reads it from XCom
💡 Hint
Consider scalability, reliability, and Airflow best practices for large data.
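The two-task workflow from option C can be sketched end to end with an in-memory dict standing in for S3: Task A uploads the 100 MB artifact and returns only its URI (which Airflow would pass to Task B via XCom), and Task B fetches the real data by URI. `fake_s3`, `upload`, and `download` are illustrative stand-ins, not real S3 or Airflow APIs:

```python
fake_s3: dict[str, bytes] = {}  # stand-in for an S3 bucket

def upload(bucket: str, key: str, body: bytes) -> str:
    """Store the object and return its URI."""
    uri = f"s3://{bucket}/{key}"
    fake_s3[uri] = body
    return uri

def download(uri: str) -> bytes:
    return fake_s3[uri]

def task_a() -> str:
    """Generate the 100 MB artifact and push only its URI via 'XCom'."""
    payload = b"\x00" * (100 * 1024 * 1024)
    return upload("my-bucket", "run-1/output.bin", payload)

def task_b(uri: str) -> int:
    """Pull the URI from 'XCom', fetch the artifact, and process it."""
    return len(download(uri))

uri = task_a()
print(uri)          # s3://my-bucket/run-1/output.bin
print(task_b(uri))  # 104857600
```

In a real DAG, Task A's return value would be the XCom, and Task B would download with an S3 hook or client; the 100 MB payload itself never touches the metadata database.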