
Why XCom Enables Task Communication in Apache Airflow

Introduction
In Airflow, tasks often need to share information to work together. XCom (short for "cross-communication") lets tasks pass small pieces of data to one another through Airflow's metadata database, which makes multi-step workflows easier to coordinate. It is a good fit in situations like these:
When one task produces data that another task needs to use later in the workflow
When you want to pass the result of a database query from one task to another without saving it externally
When you need to share a status or flag between tasks to control workflow logic
When you want to avoid writing temporary files for passing small data between tasks
When you want to keep your workflow clean and simple by using built-in Airflow features for communication
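The push-and-pull pattern behind these use cases can be sketched in plain Python. In a real DAG, Airflow passes the running task instance (commonly named `ti`) to the callable, and `ti.xcom_push` and `ti.xcom_pull` are the Airflow methods involved. The `StubTaskInstance` class, the task id `run_query`, and the key `row_count` are hypothetical names used only so the sketch runs without Airflow installed.

```python
class StubTaskInstance:
    """Hypothetical in-memory stand-in for Airflow's TaskInstance (illustration only)."""

    def __init__(self, task_id, store):
        self.task_id = task_id
        self._store = store  # shared dict playing the role of the metadata database

    def xcom_push(self, key, value):
        # Store the value under (task_id, key), as Airflow does per DAG run.
        self._store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key="return_value"):
        # Like Airflow, return None when nothing matches.
        return self._store.get((task_ids, key))


def run_query(ti):
    """Upstream task: publish a small result for later tasks."""
    row_count = 42  # placeholder for a real query result
    ti.xcom_push(key="row_count", value=row_count)


def decide(ti):
    """Downstream task: read the value and use it as a flag."""
    row_count = ti.xcom_pull(task_ids="run_query", key="row_count")
    return "process" if row_count else "skip"


store = {}
run_query(StubTaskInstance("run_query", store))
decision = decide(StubTaskInstance("decide", store))
print(decision)
```

Running the sketch prints `process`, because the downstream function sees the non-zero row count that the upstream function pushed.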
Commands
This command triggers a run of the example DAG, which demonstrates how tasks use XCom to share data.
Terminal
airflow dags trigger example_xcom_dag
Expected Output
Created <DagRun example_xcom_dag @ 2024-06-01T12:00:00+00:00: manual__2024-06-01T12:00:00+00:00, externally triggered: True>
This lists all tasks in the example DAG so you can see which tasks will run and communicate using XCom.
Terminal
airflow tasks list example_xcom_dag
Expected Output
task_1
task_2
Runs task_1 alone to see how it pushes data to XCom without running the whole DAG.
Terminal
airflow tasks test example_xcom_dag task_1 2024-06-01
Expected Output
[2024-06-01 12:00:00,000] {taskinstance.py:1234} INFO - Pushing data to XCom: {'message': 'Hello from task_1'}
[2024-06-01 12:00:00,100] {taskinstance.py:5678} INFO - Task succeeded
Runs task_2 alone to see how it pulls data from XCom that task_1 pushed earlier.
Terminal
airflow tasks test example_xcom_dag task_2 2024-06-01
Expected Output
[2024-06-01 12:01:00,000] {taskinstance.py:1234} INFO - Pulled data from XCom: {'message': 'Hello from task_1'}
[2024-06-01 12:01:00,100] {taskinstance.py:5678} INFO - Task succeeded
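The DAG behind these commands is not shown here, so the following is only a guess at what `example_xcom_dag` might look like, assuming Airflow 2.4 or later. The function names `push_message` and `pull_message` are hypothetical, and the Airflow imports are guarded so the two callables can also be tried where Airflow is not installed.

```python
from datetime import datetime

try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:  # lets the callables be exercised without Airflow
    DAG = None


def push_message():
    # A returned value is pushed to XCom automatically under the key "return_value".
    return {"message": "Hello from task_1"}


def pull_message(ti):
    # Airflow injects the running task instance as `ti`.
    data = ti.xcom_pull(task_ids="task_1")
    print(f"Pulled data from XCom: {data}")
    return data


if DAG is not None:
    with DAG(
        dag_id="example_xcom_dag",
        start_date=datetime(2024, 6, 1),
        schedule=None,  # run only when triggered manually
        catchup=False,
    ) as dag:
        task_1 = PythonOperator(task_id="task_1", python_callable=push_message)
        task_2 = PythonOperator(task_id="task_2", python_callable=pull_message)
        task_1 >> task_2  # task_2 runs after task_1
```

Returning the value from `push_message` is the shortest way to push it; `pull_message` then reads it back by naming the upstream task ID and relying on the default key.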
Key Concept

If you remember nothing else, remember: XCom is Airflow's built-in way for tasks to send and receive small pieces of data to work together.

Common Mistakes
Mistake: Trying to pass large files or big data through XCom.
Why it fails: XCom is designed for small messages; large data can cause performance issues or failures.
Fix: Use external storage such as S3 or a database for large data, and pass only a reference to it via XCom.
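Passing a reference instead of the payload might look like the sketch below. A local temporary file stands in for S3 or a database so the example runs anywhere; the function names `extract` and `load` are hypothetical, and in a real DAG the returned path would travel through XCom as the upstream task's return value.

```python
import json
import os
import tempfile


def extract():
    """Upstream task: write the large payload to storage, return only its location."""
    rows = [{"id": i, "value": i * i} for i in range(10_000)]
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(rows, f)
    return path  # a short string is all that would go into XCom


def load(path):
    """Downstream task: receive the reference and read the payload from storage."""
    with open(path) as f:
        rows = json.load(f)
    os.remove(path)  # clean up the temporary file
    return len(rows)


reference = extract()
print(load(reference))
```

When workers run on different machines, a local path would not be visible to the downstream task, which is why shared storage is used in practice.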
Mistake: Not specifying the correct task ID when pulling data from XCom.
Why it fails: If the task ID is wrong, xcom_pull finds nothing and returns None, so the task either fails later or silently works with missing data.
Fix: Always pull with the exact task ID of the task that pushed the XCom data.
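Because `xcom_pull` returns None when nothing matches, a typo in the task ID can go unnoticed until later code breaks. One way to fail fast is a small wrapper such as the hypothetical `pull_required` below; `StubTaskInstance` is an illustrative stand-in, not an Airflow class.

```python
class StubTaskInstance:
    """Hypothetical stand-in that mimics xcom_pull returning None on a miss."""

    def __init__(self, xcoms):
        self._xcoms = xcoms  # maps (task_id, key) -> value

    def xcom_pull(self, task_ids, key="return_value"):
        return self._xcoms.get((task_ids, key))


def pull_required(ti, task_id, key="return_value"):
    """Pull an XCom value and raise a clear error if it is missing."""
    value = ti.xcom_pull(task_ids=task_id, key=key)
    if value is None:
        raise ValueError(f"No XCom found for task_id={task_id!r}, key={key!r}")
    return value


ti = StubTaskInstance({("task_1", "return_value"): {"message": "Hello from task_1"}})
print(pull_required(ti, "task_1"))

try:
    pull_required(ti, "task1")  # typo: missing underscore
except ValueError as err:
    print(err)
```

The wrapper turns a silent None into an error that names the task ID and key that were requested, which makes the typo easy to spot in the task log.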
Mistake: Expecting XCom data to persist indefinitely.
Why it fails: XCom data is stored in Airflow's metadata database and can be cleaned up or overwritten.
Fix: Use XCom only for short-term communication within the same DAG run.
Summary
XCom lets Airflow tasks share small pieces of data easily within a workflow.
Push a value in one task (with xcom_push, or simply by returning it) and pull it in another with xcom_pull.
Avoid using XCom for large data or long-term storage; it's for quick task communication.