Recall & Review
beginner
What is the default size limit for an XCom value in Airflow?
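Airflow has historically capped XCom values at about 48 KB (the `MAX_XCOM_SIZE` constant, 49,344 bytes). Newer releases may not enforce that figure, in which case the practical ceiling comes from the metadata database column that stores the value. Either way, XComs live in the Airflow metadata database, which is not designed for large data.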
beginner
Why should you avoid storing large data directly in XComs?
Storing large data in XComs can slow down the Airflow database, cause performance issues, and may lead to failures due to size limits.
beginner
Name one common alternative to storing large data in XComs.
A common alternative is to store large data in external storage like Amazon S3, Google Cloud Storage, or a database, and pass only the reference or path via XCom.
intermediate
How can you pass data between tasks without hitting XCom size limits?
You can save the data to a file or cloud storage in one task, then pass the file path or URL via XCom to the next task, which reads the data from there.
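The handoff described in this answer can be sketched with two plain Python functions. This is a minimal illustration, not Airflow code: `extract` and `transform` are hypothetical names, and in a real DAG they would typically be wrapped as tasks so that the string returned by the first one becomes the XCom value. A local path only works when both tasks share a filesystem; with distributed workers the same pattern applies to a shared volume or an object-store URL.

```python
import json
import tempfile
from pathlib import Path


def extract(output_dir: str) -> str:
    """Write a large payload to disk and return only its location."""
    rows = [{"id": i, "value": i * i} for i in range(10_000)]
    path = Path(output_dir) / "rows.json"
    path.write_text(json.dumps(rows))
    # Only this short string would travel through XCom.
    return str(path)


def transform(path: str) -> int:
    """Read the payload back from the location received via XCom."""
    rows = json.loads(Path(path).read_text())
    return sum(row["value"] for row in rows)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        reference = extract(tmp)
        print(f"reference is {len(reference)} characters long")
        print(f"sum of values: {transform(reference)}")
```

The payload itself never touches the metadata database; only the reference does.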
advanced
What is the role of XCom backends in managing XCom size limitations?
Custom XCom backends can be implemented to store XCom data outside the metadata database, such as in files or object storage, helping to handle larger data sizes.
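In Airflow 2, a custom backend is a subclass of `airflow.models.xcom.BaseXCom` that overrides `serialize_value` and `deserialize_value`, and it is selected with the `xcom_backend` option in the `[core]` config section. The sketch below shows only the offloading logic those two hooks might contain, as a standalone class so it runs without Airflow installed. `OffloadingXComSketch`, `STORE_DIR`, `PREFIX`, and `THRESHOLD_BYTES` are hypothetical names, the method signatures are simplified compared with the real hooks, and a local directory stands in for object storage.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Hypothetical settings for illustration only.
STORE_DIR = Path(tempfile.gettempdir()) / "xcom_offload_sketch"
PREFIX = "offload://"
THRESHOLD_BYTES = 48 * 1024


class OffloadingXComSketch:
    """Mirrors the two hooks a custom XCom backend overrides."""

    @staticmethod
    def serialize_value(value) -> str:
        payload = json.dumps(value)
        if len(payload.encode("utf-8")) <= THRESHOLD_BYTES:
            # Small enough: keep the value inline.
            return payload
        # Too large: write it elsewhere and keep only a reference.
        STORE_DIR.mkdir(parents=True, exist_ok=True)
        target = STORE_DIR / f"{uuid.uuid4().hex}.json"
        target.write_text(payload)
        return json.dumps(PREFIX + str(target))

    @staticmethod
    def deserialize_value(stored: str):
        value = json.loads(stored)
        if isinstance(value, str) and value.startswith(PREFIX):
            # Follow the reference back to the offloaded payload.
            return json.loads(Path(value[len(PREFIX):]).read_text())
        return value
```

One limitation of this design: an ordinary small string that happens to start with the prefix would be mistaken for a reference, so a real backend would need a less ambiguous marker.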
What happens if you try to store very large data directly in an XCom?
XComs are stored in the Airflow metadata database, which has size limits. Large data can cause performance problems or errors.
Which of the following is a good practice to handle large data between Airflow tasks?
Storing large data externally and passing only the reference via XCom avoids size limits and improves performance.
What is an XCom backend in Airflow?
XCom backends allow customizing storage of XCom data, for example, storing large data outside the metadata database.
What type of data is best suited for XComs?
XComs are designed for small pieces of data like metadata or references, not large files or datasets.
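One way to keep return values small is to measure them before they are returned from a task. The helper below is a rough sketch under stated assumptions: `MAX_XCOM_BYTES` is an arbitrary budget chosen for illustration, the function names are hypothetical, and JSON length is used as an approximation of the stored size.

```python
import json

# Assumed budget; pick a value that suits your metadata database.
MAX_XCOM_BYTES = 48 * 1024


def xcom_size_bytes(value) -> int:
    """Approximate the stored size as the UTF-8 length of the JSON form."""
    return len(json.dumps(value).encode("utf-8"))


def ensure_xcom_friendly(value, limit: int = MAX_XCOM_BYTES):
    """Return the value unchanged, or fail fast if it exceeds the budget."""
    size = xcom_size_bytes(value)
    if size > limit:
        raise ValueError(
            f"payload is {size} bytes, over the {limit}-byte budget; "
            "store it externally and return a reference instead"
        )
    return value
```

A task could end with `return ensure_xcom_friendly(result)` so that an oversized value fails loudly in the task that produced it rather than degrading the database.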
If you want to pass a large JSON object between tasks, what should you do?
Large JSON objects should be stored externally and only the reference passed via XCom to avoid size limits.
Explain why XComs have size limitations and how you can work around them in Airflow.
Think about where XCom data is stored and what happens when values get large.
Describe how to implement an alternative to large XCom data passing using cloud storage.
Consider separating where the data is stored from the reference that is passed between tasks.