In Apache Airflow, tasks often need to share data. Which of the following best explains how XCom enables this communication?
Think about where Airflow keeps task metadata and how tasks can access shared information.
XCom stores small pieces of data in Airflow's metadata database. Tasks can push data to XCom and other tasks can pull it later, enabling asynchronous communication without relying on files or environment variables.
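The push/pull idea can be illustrated with a toy model. The sketch below is not Airflow code: it uses an in-memory SQLite table as a stand-in for the metadata database, and the table layout and the helper names `xcom_push` and `xcom_pull` are simplifications chosen for illustration. Airflow's real table carries more columns.

```python
import json
import sqlite3

# Toy stand-in for the metadata database; NOT Airflow's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xcom (run_id TEXT, task_id TEXT, key TEXT, value TEXT, "
    "PRIMARY KEY (run_id, task_id, key))"
)


def xcom_push(run_id, task_id, key, value):
    """Store a small JSON-serializable value under (run, task, key)."""
    conn.execute(
        "INSERT OR REPLACE INTO xcom VALUES (?, ?, ?, ?)",
        (run_id, task_id, key, json.dumps(value)),
    )


def xcom_pull(run_id, task_id, key):
    """Return the stored value, or None when nothing was pushed."""
    row = conn.execute(
        "SELECT value FROM xcom WHERE run_id = ? AND task_id = ? AND key = ?",
        (run_id, task_id, key),
    ).fetchone()
    return json.loads(row[0]) if row else None


xcom_push("run_1", "push_task", "sample_key", "hello world")
print(xcom_pull("run_1", "push_task", "sample_key"))   # hello world
print(xcom_pull("run_1", "push_task", "missing_key"))  # None
```

The writer and the reader never call each other. They only agree on the run, the task ID, and the key, which is why the communication can be asynchronous.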
Consider two Airflow tasks where task A pushes a value to XCom and task B pulls it. What will be printed by task B?
    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from datetime import datetime

    def push_function(ti):
        ti.xcom_push(key='sample_key', value='hello world')

    def pull_function(ti):
        value = ti.xcom_pull(key='sample_key', task_ids='push_task')
        print(value)

    dag = DAG('xcom_example', start_date=datetime(2023, 1, 1))

    push_task = PythonOperator(task_id='push_task', python_callable=push_function, dag=dag)
    pull_task = PythonOperator(task_id='pull_task', python_callable=pull_function, dag=dag)

    push_task >> pull_task
Remember that task B pulls the value pushed by task A using the same key and task ID.
Task A pushes 'hello world' with key 'sample_key'. Task B pulls this exact key from task A and prints it, so the output is 'hello world'.
A developer notices that their task pulling from XCom always gets None. The push task runs before the pull task. What is the most likely cause?
    def pull_function(ti):
        value = ti.xcom_pull(key='data_key', task_ids='push_task')
        print(value)

    # The push task uses ti.xcom_push(key='data_key', value='data') correctly.
Check if the push task actually completed and pushed data.
xcom_pull returns None when no matching XCom exists. If the push task failed or never reached its xcom_push call, nothing was stored, so the pull task gets None even though the key, the task ID, and the task order are all correct.
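One way to surface this problem early is to fail loudly when the pull comes back empty. The sketch below is an assumption-laden illustration: `pull_or_fail` and `FakeTaskInstance` are hypothetical names, and the stub only mimics the `xcom_pull(key=..., task_ids=...)` call so the logic can be tried without a running Airflow instance.

```python
def pull_or_fail(ti, key="data_key", task_ids="push_task"):
    """Pull an XCom value and raise a clear error if nothing was found."""
    value = ti.xcom_pull(key=key, task_ids=task_ids)
    if value is None:
        raise ValueError(
            f"No XCom found for key {key!r} from task {task_ids!r}; "
            "check that the upstream task succeeded and pushed a value."
        )
    return value


class FakeTaskInstance:
    """Hypothetical stub that mimics only xcom_pull, for local testing."""

    def __init__(self, store):
        self._store = store

    def xcom_pull(self, key=None, task_ids=None):
        # Like the real call, return None when no value is stored.
        return self._store.get((task_ids, key))


ti = FakeTaskInstance({("push_task", "data_key"): "data"})
print(pull_or_fail(ti))  # data
```

This check cannot distinguish a missing XCom from one whose value was deliberately pushed as None, so it suits tasks that never push None on purpose.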
In a complex DAG, tasks need to exchange small pieces of data even though they cannot call one another directly. Which Airflow feature is designed for this purpose?
Think about a feature that allows tasks to exchange data during DAG execution.
XCom is the Airflow feature that enables asynchronous data exchange between tasks by storing small messages in the metadata database. Variables and Connections serve different purposes.
When using XCom to share data between tasks, which practice helps avoid common issues?
Consider how XCom stores data and what limitations it has.
XCom stores data in the metadata database, so keeping data small and JSON-serializable avoids performance issues and errors. Large files should be stored externally. Environment variables are not suitable for dynamic task data sharing.
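The "small and JSON-serializable" guideline can be enforced with a check before pushing. The following is a minimal sketch: the helper name `check_xcom_payload` and the 48 KB threshold are assumptions made for illustration, not limits defined by Airflow. The practical ceiling depends on the database backend and deployment.

```python
import json

# Hypothetical size budget for illustration; not an Airflow-defined limit.
MAX_XCOM_BYTES = 48 * 1024


def check_xcom_payload(value, max_bytes=MAX_XCOM_BYTES):
    """Return value unchanged if it is JSON-serializable and small enough."""
    try:
        encoded = json.dumps(value).encode("utf-8")
    except (TypeError, ValueError) as exc:
        raise ValueError(f"XCom value is not JSON-serializable: {exc}") from exc
    if len(encoded) > max_bytes:
        raise ValueError(
            f"XCom value is {len(encoded)} bytes (limit {max_bytes}); "
            "store it externally and push a reference instead."
        )
    return value


print(check_xcom_payload({"rows": 3, "status": "ok"}))  # {'rows': 3, 'status': 'ok'}
```

Inside a task, the checked value would then be passed to `ti.xcom_push`. For large outputs, a common pattern is to write the data to external storage and push only its location, such as a file path, as the XCom value.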