Limitations of XCom in Airflow: What You Need to Know
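XCom is meant for small values passed between tasks. The ceiling on a single entry is often quoted as about 48KB, a figure that dates from older Airflow releases; in practice it depends on the metadata database backend. Values must also be serializable: JSON by default in Airflow 2.x, or pickle if XCom pickling is enabled. Large or complex data should be stored externally, as XCom is not designed for heavy data transfer or long-term storage.
Syntax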
The XCom feature in Airflow allows tasks to share small pieces of data using the xcom_push() and xcom_pull() methods.
- task_instance.xcom_push(key, value): Sends data with a key.
- task_instance.xcom_pull(task_ids, key): Retrieves data by key from a specific task.
This data is stored in Airflow's metadata database and must be serializable.
def push_function(ti):
    ti.xcom_push(key='sample_key', value='small_data')

def pull_function(ti):
    data = ti.xcom_pull(task_ids='push_task', key='sample_key')
    print(f"Pulled data: {data}")
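The serializability requirement above can be checked before a value is ever pushed. The following is a minimal sketch, assuming the JSON default of Airflow 2.x; `is_json_serializable` is a hypothetical helper, not part of Airflow, and it is deliberately conservative because Airflow's own serializer may accept some additional types.

```python
import json


def is_json_serializable(value):
    """Return True if `value` survives a plain json.dumps() call."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        # TypeError: unsupported type (set, file handle, custom object)
        # ValueError: e.g. a circular reference
        return False


# Plain containers of strings and numbers pass the check.
print(is_json_serializable({"rows": 42, "status": "ok"}))
# A set is not a JSON type, so the check rejects it.
print(is_json_serializable({1, 2, 3}))
```

Calling a check like this inside the pushing task gives an early, readable failure instead of a serialization error raised from deep inside the XCom machinery.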
Example
This example shows how to push and pull a small string using XCom in Airflow tasks.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def push_function(ti):
    # Store a small string under an explicit key
    ti.xcom_push(key='message', value='Hello from XCom!')

def pull_function(ti):
    # Read the value back from the upstream task by task_id and key
    message = ti.xcom_pull(task_ids='push_task', key='message')
    print(f"Pulled message: {message}")

# On Airflow 2.4+, `schedule=None` is the preferred spelling of `schedule_interval=None`
with DAG('xcom_limitations_example', start_date=datetime(2024, 1, 1), schedule_interval=None, catchup=False) as dag:
    push_task = PythonOperator(
        task_id='push_task',
        python_callable=push_function
    )
    pull_task = PythonOperator(
        task_id='pull_task',
        python_callable=pull_function
    )
    push_task >> pull_task
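To try the push and pull callables without a running scheduler, one option is a small in-memory stand-in for the task instance. This is only a sketch: `FakeTaskInstance` is a hypothetical test double, not an Airflow class, and it mimics just the two method signatures used in this article. Here `pull_function` returns the value instead of printing it so the result can be inspected.

```python
class FakeTaskInstance:
    """Hypothetical in-memory stand-in for a task instance (for local tests only)."""

    def __init__(self, task_id, store):
        self.task_id = task_id
        self.store = store  # shared dict mapping (task_id, key) -> value

    def xcom_push(self, key, value):
        # Record the value under the pushing task's id and the given key
        self.store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key):
        # Look up a value pushed by another task; None if nothing matches
        return self.store.get((task_ids, key))


def push_function(ti):
    ti.xcom_push(key='message', value='Hello from XCom!')


def pull_function(ti):
    return ti.xcom_pull(task_ids='push_task', key='message')


store = {}
push_function(FakeTaskInstance('push_task', store))
pulled = pull_function(FakeTaskInstance('pull_task', store))
print(f"Pulled message: {pulled}")
```

The shared dictionary plays the role of the metadata database, which makes it easy to see that the pull only succeeds when `task_ids` and `key` match what the upstream task pushed.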
Common Pitfalls
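1. Size Limit: XCom data is stored in the Airflow metadata database, so its size is bounded. The commonly cited figure is about 48KB, which comes from older Airflow releases; in Airflow 2.x the actual ceiling depends on the database backend. Pushing large data can fail outright or bloat the database.
2. Serialization Issues: Data must be serializable. Airflow 2.x uses JSON by default, and pickle only when XCom pickling is explicitly enabled. Complex objects or open file handles cannot be passed.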
3. Not for Long-Term Storage: XCom is meant for short-lived data sharing between tasks, not for persistent storage.
4. Performance Impact: Excessive use of XCom with large data can slow down the scheduler and database.
def wrong_push(ti):
    large_data = 'x' * 100000  # ~100KB string, too large for XCom
    ti.xcom_push(key='large', value=large_data)  # may fail or bloat the metadata database

def right_push(ti):
    # Instead, store large data externally and push only a reference to it
    path = '/tmp/large_data.txt'
    with open(path, 'w') as f:
        f.write('x' * 100000)
    ti.xcom_push(key='file_path', value=path)
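The "store externally, push a reference" pattern can be wrapped in a pair of helpers that decide automatically based on payload size. This is a sketch under stated assumptions: `to_xcom_value`, `from_xcom_value`, the `xcom_ref` key, and the 48KB budget are all hypothetical choices for illustration, and a local temporary file only works when the pushing and pulling tasks share a filesystem.

```python
import json
import os
import tempfile

MAX_XCOM_BYTES = 48 * 1024  # assumed budget; tune to your metadata database


def to_xcom_value(value, directory=None, max_bytes=MAX_XCOM_BYTES):
    """Return `value` unchanged if small, else write it to a file and return a reference."""
    payload = json.dumps(value).encode("utf-8")
    if len(payload) <= max_bytes:
        return value
    # Too large for XCom: persist the JSON payload and hand back only its path
    fd, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)
    return {"xcom_ref": path}


def from_xcom_value(value):
    """Resolve a reference produced by to_xcom_value; pass other values through."""
    if isinstance(value, dict) and set(value) == {"xcom_ref"}:
        with open(value["xcom_ref"], "rb") as handle:
            return json.loads(handle.read().decode("utf-8"))
    return value
```

In a task, the pushing side would call `ti.xcom_push(key='result', value=to_xcom_value(data))` and the pulling side would wrap its `ti.xcom_pull(...)` result in `from_xcom_value(...)`. For workers on separate machines, the file write would need to be swapped for shared storage such as S3.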
Quick Reference
XCom Limitations Summary:
- Max size depends on the metadata database; ~48KB per XCom entry is a safe working budget.
- Data must be serializable (JSON by default, pickle only if enabled).
- Not suitable for large or binary data.
- Use external storage (S3, DB, files) for big data.
- Use XCom for small metadata or signals only.