
Poke vs Reschedule Sensor Mode in Airflow: Key Differences and Usage

In Airflow, a sensor in poke mode holds its worker slot for the whole wait and sleeps between checks, while a sensor in reschedule mode releases the slot after each unsuccessful check and is rescheduled for the next one. Reschedule mode is more resource-efficient for long waits, whereas poke mode is simpler and reacts faster for short waits.

Quick Comparison

This table summarizes the main differences between poke and reschedule sensor modes in Airflow.

| Factor | Poke Mode | Reschedule Mode |
| --- | --- | --- |
| Worker Slot Usage | Holds slot during entire sensor execution | Releases slot between checks |
| Resource Efficiency | Less efficient for long waits | More efficient for long waits |
| Check Frequency | Every poke_interval seconds, sleeping inside the running task | Every poke_interval seconds, with the task rescheduled for each check |
| Task State | Task stays running | Task goes to up_for_reschedule state between checks |
| Use Case | Short wait times or fast checks | Long wait times or slow checks |
| Complexity | Simpler implementation | Slightly more complex due to rescheduling |

Key Differences

The poke mode in Airflow sensors keeps the task running: it checks the condition, sleeps for poke_interval seconds, and checks again. The worker slot is occupied the entire time the sensor is active, including while it sleeps, which wastes resources if the wait is long.

In contrast, the reschedule mode frees the worker slot after each check if the condition is not met. The task moves to the up_for_reschedule state, and the scheduler runs it again once the poke interval has elapsed. This approach allows Airflow to use worker slots more efficiently, especially for sensors that wait for long periods.

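While poke mode is simpler and may be preferred for short or quick checks, reschedule mode is better suited for long-running sensors to avoid blocking workers unnecessarily. However, reschedule mode introduces some overhead: every check is a separate task run that the scheduler has to queue and start, so very short poke intervals add scheduler load. The Airflow documentation suggests a poke interval above one minute in this mode.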
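The trade-off can be put into rough numbers. The sketch below is a toy model in plain Python, not Airflow code: the function name slot_seconds and its parameters are made up for illustration, and it assumes every check takes the same time and that a rescheduled run has no startup cost (in a real deployment it does).

```python
def slot_seconds(mode: str, checks: int, poke_interval_s: float, check_cost_s: float) -> float:
    """Approximate time a worker slot is occupied by one sensor run (toy model).

    checks          -- number of checks until the condition is met (the last one succeeds)
    poke_interval_s -- pause between two consecutive checks
    check_cost_s    -- assumed duration of a single check
    """
    if checks < 1:
        raise ValueError("checks must be at least 1")
    busy = checks * check_cost_s  # time spent actually checking
    if mode == "poke":
        # The task keeps running, so the slot is also held while it sleeps.
        return busy + (checks - 1) * poke_interval_s
    if mode == "reschedule":
        # The slot is released between checks, so only the checks count.
        return busy
    raise ValueError(f"unknown mode: {mode!r}")


# A sensor that succeeds on its 61st check, polling once a minute (about a one-hour wait).
poke_cost = slot_seconds("poke", checks=61, poke_interval_s=60, check_cost_s=1)
reschedule_cost = slot_seconds("reschedule", checks=61, poke_interval_s=60, check_cost_s=1)
print(poke_cost, reschedule_cost)  # 3661 61
```

Under these assumptions the poke sensor occupies a slot for about an hour, while the reschedule sensor occupies one for about a minute in total. The real gap is smaller because each rescheduled run pays a startup cost that this model ignores.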


Code Comparison

Here is an example of a sensor using poke mode that checks for a file's existence every 10 seconds.

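```python
from datetime import datetime

from airflow import DAG
from airflow.sensors.filesystem import FileSensor

default_args = {
    'start_date': datetime(2024, 1, 1),
    'retries': 0,
}

# Airflow 2.4 deprecated `schedule_interval` in favor of `schedule`;
# on newer releases, pass schedule='@daily' instead.
dag = DAG('poke_mode_example', default_args=default_args, schedule_interval='@daily')

# Wait for the file from inside a single long-running task.
file_sensor = FileSensor(
    task_id='poke_file_sensor',
    filepath='/tmp/myfile.txt',
    poke_interval=10,  # seconds to sleep between checks
    mode='poke',       # keep the worker slot for the whole wait (the default mode)
    dag=dag,
)
```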
Output
The sensor task runs continuously, holding a worker slot and checking every 10 seconds until the file exists.

Reschedule Mode Equivalent

This example shows the same file sensor using reschedule mode to free the worker slot between checks.

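```python
from datetime import datetime

from airflow import DAG
from airflow.sensors.filesystem import FileSensor

default_args = {
    'start_date': datetime(2024, 1, 1),
    'retries': 0,
}

# Airflow 2.4 deprecated `schedule_interval` in favor of `schedule`;
# on newer releases, pass schedule='@daily' instead.
dag = DAG('reschedule_mode_example', default_args=default_args, schedule_interval='@daily')

# Wait for the file, giving the worker slot back after every unsuccessful check.
file_sensor = FileSensor(
    task_id='reschedule_file_sensor',
    filepath='/tmp/myfile.txt',
    # 10 s mirrors the poke example; the Airflow docs suggest more than
    # 60 s in reschedule mode to limit scheduler load.
    poke_interval=10,
    mode='reschedule',  # release the worker slot between checks
    dag=dag,
)
```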
Output
The sensor task releases the worker slot after each check and is rescheduled to run again in 10 seconds until the file exists.
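To see the two behaviours side by side without running Airflow, here is a toy timeline in plain Python. It is only a sketch: the function name timeline is hypothetical, each check is assumed to take no time, and scheduling delay is ignored. The state names running, up_for_reschedule, and success are the ones Airflow uses for task instances.

```python
def timeline(mode: str, checks_until_success: int, poke_interval_s: int) -> list:
    """Return (time_in_seconds, state) transitions for one sensor run (toy model)."""
    if mode not in ("poke", "reschedule"):
        raise ValueError(f"unknown mode: {mode!r}")
    if checks_until_success < 1:
        raise ValueError("checks_until_success must be at least 1")

    t = 0
    events = [(t, "running")]  # the first check takes a worker slot
    for _ in range(checks_until_success - 1):
        # This check failed, so the sensor has to wait for the next one.
        if mode == "reschedule":
            events.append((t, "up_for_reschedule"))  # slot is released
        t += poke_interval_s
        if mode == "reschedule":
            events.append((t, "running"))  # slot is taken again for the next check
        # In poke mode the task simply sleeps and stays in "running".
    events.append((t, "success"))
    return events


print(timeline("poke", checks_until_success=3, poke_interval_s=10))
print(timeline("reschedule", checks_until_success=3, poke_interval_s=10))
```

For a file that appears before the third check, the poke timeline has just two entries (running at 0 s, success at 20 s), while the reschedule timeline alternates between running and up_for_reschedule before reaching success at 20 s.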

When to Use Which

Choose poke mode when your sensor checks are quick and you expect the condition to be met soon, as it keeps the task simple and responsive.

Choose reschedule mode for sensors that wait for long periods or have slow checks, to avoid blocking worker slots and improve resource efficiency.

In general, reschedule mode is recommended for production environments with many sensors to optimize worker usage.
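The guidance above can be turned into a small helper. This is one possible heuristic, not an Airflow feature: the function suggest_sensor_settings and the 5-minute threshold for a "long" wait are assumptions chosen for illustration, while the 60-second floor follows the Airflow documentation's advice for reschedule mode. The returned keys, mode and poke_interval, are real sensor arguments.

```python
# Assumption for illustration: waits of 5 minutes or more count as "long".
LONG_WAIT_THRESHOLD_S = 300
# The Airflow docs advise a poke interval above one minute in reschedule mode.
MIN_RESCHEDULE_INTERVAL_S = 60


def suggest_sensor_settings(expected_wait_s: float, poke_interval_s: float) -> dict:
    """Suggest `mode` and `poke_interval` for a sensor (hypothetical heuristic)."""
    if expected_wait_s < LONG_WAIT_THRESHOLD_S:
        # Short wait: holding the slot is cheap, and poke mode reacts quickly.
        return {"mode": "poke", "poke_interval": poke_interval_s}
    # Long wait: free the slot between checks and avoid very short intervals.
    return {
        "mode": "reschedule",
        "poke_interval": max(poke_interval_s, MIN_RESCHEDULE_INTERVAL_S),
    }


print(suggest_sensor_settings(expected_wait_s=30, poke_interval_s=10))
print(suggest_sensor_settings(expected_wait_s=3600, poke_interval_s=10))
```

The resulting dictionary could be unpacked into a sensor, for example FileSensor(task_id=..., filepath=..., **settings). Tune the threshold to your own worker capacity and scheduler load.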

Key Takeaways

Poke mode holds a worker slot continuously, suitable for short waits.
Reschedule mode frees the worker slot between checks, ideal for long waits.
Use poke mode for fast, frequent checks to keep things simple.
Use reschedule mode to save resources during long sensor waits.
Reschedule mode is preferred in production for better worker slot management.