0
0
Apache Airflowdevops~10 mins

TaskFlow API for cleaner XCom in Apache Airflow - Step-by-Step Execution

Choose your learning style9 modes available
Process Flow - TaskFlow API for cleaner XCom
Define Python function task
Decorate with @task
Use function in DAG context
Return value auto-pushed to XCom
Downstream task receives value as argument
No manual XCom push/pull needed
DAG runs cleanly
The TaskFlow API lets you write Python functions as tasks that automatically pass data via XCom without manual push or pull.
Execution Sample
Apache Airflow
from airflow.decorators import dag, task
from datetime import datetime

@dag(start_date=datetime(2024,1,1), schedule_interval='@daily', catchup=False)
def my_dag():
    @task
    def get_data():
        return 'hello'

    @task
    def print_data(data):
        print(data)

    print_data(get_data())

dag = my_dag()
This DAG defines two tasks using TaskFlow API; get_data returns a string, print_data prints it, passing data automatically via XCom.
Process Table
StepActionTaskXCom PushXCom PullOutput/Result
1Run get_data taskget_data'hello' pushed automaticallyNoneReturns 'hello'
2Run print_data task with argumentprint_dataNone'hello' pulled automaticallyPrints 'hello'
3DAG completesallAll XComs handled by TaskFlowAll XComs handled by TaskFlowClean DAG run with no manual XCom code
💡 All tasks complete; data passed automatically via TaskFlow API without manual XCom push/pull.
Status Tracker
VariableStartAfter get_dataAfter print_dataFinal
dataNone'hello' returned by get_data'hello' received by print_data'hello' printed
Key Moments - 2 Insights
Why don't we see any manual XCom push or pull commands in the code?
Because the TaskFlow API automatically pushes the return value of a @task function to XCom and pulls it when passed as an argument, as shown in execution_table steps 1 and 2.
How does print_data get the value from get_data without explicit XCom calls?
The TaskFlow API handles this by automatically pulling the XCom value returned by get_data when print_data is called with get_data() as argument, as seen in execution_table step 2.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table, what does the get_data task push to XCom?
ANone
B'hello'
Cprint_data function
DA datetime object
💡 Hint
Check the 'XCom Push' column in row 1 of the execution_table.
At which step does print_data pull data from XCom automatically?
AStep 2
BStep 1
CStep 3
DNever
💡 Hint
Look at the 'XCom Pull' column in the execution_table for step 2.
If we changed get_data to return a number instead of 'hello', what would print_data receive?
AThe string 'hello'
BNone
CThe number returned by get_data
DAn error
💡 Hint
Refer to variable_tracker showing how data flows from get_data to print_data.
Concept Snapshot
TaskFlow API uses @task decorator to turn Python functions into Airflow tasks.
Return values are automatically pushed to XCom.
Passing a task's output as argument to another task auto-pulls XCom.
No manual XCom push/pull needed.
Simplifies data passing and makes DAG code cleaner.
Full Transcript
The TaskFlow API in Airflow lets you write Python functions as tasks using the @task decorator. When a task function returns a value, Airflow automatically pushes it to XCom. When another task function takes that task's output as an argument, Airflow automatically pulls the value from XCom. This removes the need for manual XCom push and pull commands. In the example DAG, get_data returns 'hello' which is automatically pushed to XCom. The print_data task receives this value as an argument, automatically pulling it from XCom, and prints it. This makes DAGs cleaner and easier to read.