What is SqlSensor in Airflow: Explanation and Example
SqlSensor in Airflow is a sensor that waits for a specific SQL query to return a result before continuing a workflow. It repeatedly runs the query until it finds data or times out, helping coordinate tasks based on database state.How It Works
SqlSensor acts like a watchful friend who keeps checking a database until a certain condition is met. It runs a SQL query repeatedly at set intervals, waiting for the query to return at least one row of data.
Think of it as waiting for a package delivery: you keep checking the tracking status until it shows "delivered". Similarly, SqlSensor keeps checking the database until the data you expect appears, then it lets the workflow continue.
This helps Airflow workflows pause and resume based on real-time data availability, ensuring tasks run only when their required data is ready.
Example
This example shows how to use SqlSensor to wait until a table has at least one row where the status is 'ready'.
from airflow import DAG from airflow.providers.postgres.sensors.sql import SqlSensor from airflow.utils.dates import days_ago with DAG(dag_id='sqlsensor_example', start_date=days_ago(1), schedule_interval='@daily') as dag: wait_for_data = SqlSensor( task_id='wait_for_ready_data', conn_id='postgres_default', sql="SELECT 1 FROM my_table WHERE status = 'ready' LIMIT 1", poke_interval=30, # check every 30 seconds timeout=600 # timeout after 10 minutes )
When to Use
Use SqlSensor when you want your Airflow task to wait for data to appear or change in a database before moving on. This is common in data pipelines where downstream tasks depend on upstream data availability.
For example, if you have a task that processes new records in a database, you can use SqlSensor to pause until those records exist. It helps avoid errors from running tasks too early.
It is also useful for coordinating workflows that depend on external systems updating a database asynchronously.
Key Points
SqlSensorwaits by running a SQL query repeatedly until it returns data.- It helps synchronize Airflow tasks with database state.
- You can set how often it checks and how long it waits before timing out.
- It supports many databases via Airflow connections.
Key Takeaways
SqlSensor waits for a SQL query to return data before continuing a task.