Airflow · Concept · Beginner · 3 min read

What is SqlSensor in Airflow: Explanation and Example

SqlSensor in Airflow is a sensor that waits for a specific SQL query to return a result before continuing a workflow. It repeatedly runs the query until it finds data or times out, helping coordinate tasks based on database state.
⚙️

How It Works

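SqlSensor acts like a watchful friend who keeps checking a database until a certain condition is met. It reruns a SQL query at a fixed interval and, by default, succeeds once the query returns a row whose first cell is not a zero or empty value. An empty result set, or a first cell such as 0, means "not yet", and the sensor checks again.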

Think of it as waiting for a package delivery: you keep checking the tracking status until it shows "delivered". Similarly, SqlSensor keeps checking the database until the data you expect appears, then it lets the workflow continue.

This helps Airflow workflows pause and resume based on real-time data availability, ensuring tasks run only when their required data is ready.
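
The polling behaviour described above can be sketched in plain Python. This is a simplified stand-in, not Airflow's actual implementation: it uses an in-process `sqlite3` connection, and the names `poke` and `wait_for_sql` are hypothetical helpers chosen for illustration. The success rule (first cell not in 0, '0', '' or NULL) follows the documented default as an assumption.

```python
import sqlite3
import time


def poke(conn, sql):
    """Run the query once; succeed only if the first cell of the first row is set."""
    row = conn.execute(sql).fetchone()
    if row is None:  # empty result set: nothing to look at yet
        return False
    # Assumed default rule: zero-like and empty values count as "not ready".
    return row[0] not in (0, "0", "", None)


def wait_for_sql(conn, sql, poke_interval=30.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until poke() succeeds and return the number of checks made.

    Raises TimeoutError once `timeout` seconds have elapsed without success.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    started = clock()
    checks = 0
    while True:
        checks += 1
        if poke(conn, sql):
            return checks
        if clock() - started >= timeout:
            raise TimeoutError(f"no matching row after {checks} checks")
        sleep(poke_interval)
```

Calling `wait_for_sql(conn, "SELECT 1 FROM my_table WHERE status = 'ready' LIMIT 1")` blocks until another writer inserts a matching row. A real sensor also handles retries, rescheduling, and logging, which this sketch leaves out.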

💻

Example

This example shows how to use SqlSensor to wait until a table has at least one row where the status is 'ready'.

python
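from datetime import datetime

from airflow import DAG
# SqlSensor ships in the common SQL provider (apache-airflow-providers-common-sql);
# the Postgres provider must also be installed for a Postgres connection.
from airflow.providers.common.sql.sensors.sql import SqlSensor

with DAG(
    dag_id='sqlsensor_example',
    start_date=datetime(2024, 1, 1),
    schedule='@daily',  # Airflow 2.4+; earlier releases use schedule_interval
    catchup=False,
) as dag:
    wait_for_data = SqlSensor(
        task_id='wait_for_ready_data',
        conn_id='postgres_default',
        sql="SELECT 1 FROM my_table WHERE status = 'ready' LIMIT 1",
        poke_interval=30,  # check every 30 seconds
        timeout=600,       # fail the task after 10 minutes without a match
    )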

Output

The task 'wait_for_ready_data' reruns the query every 30 seconds. It succeeds as soon as a row with status 'ready' exists. If no such row appears within 10 minutes, the sensor times out and the task fails.
🎯

When to Use

Use SqlSensor when you want your Airflow task to wait for data to appear or change in a database before moving on. This is common in data pipelines where downstream tasks depend on upstream data availability.

For example, if you have a task that processes new records in a database, you can use SqlSensor to pause until those records exist. It helps avoid errors from running tasks too early.

It is also useful for coordinating workflows that depend on external systems updating a database asynchronously.
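
What counts as "data is ready" depends on the query you write. The sketch below uses SQLite as a stand-in for the real database and shows what three query shapes put in the first cell before and after records arrive. The `orders` table, the `load_date` column, and the threshold of 3 are hypothetical, and the "first cell must be non-zero and non-empty" rule is assumed to be the sensor's default.

```python
import sqlite3


def first_cell(conn, sql, params=()):
    """Return the first cell of the first row, or None if the query returns no rows."""
    row = conn.execute(sql, params).fetchone()
    return None if row is None else row[0]


# Three ways to phrase "today's records have arrived" (hypothetical schema).
QUERIES = {
    "exists": "SELECT 1 FROM orders WHERE load_date = ? LIMIT 1",
    "count": "SELECT COUNT(*) FROM orders WHERE load_date = ?",
    "threshold": "SELECT COUNT(*) >= 3 FROM orders WHERE load_date = ?",
}


def snapshot(conn, load_date):
    """Evaluate every query and collect the value a sensor would inspect."""
    return {name: first_cell(conn, sql, (load_date,)) for name, sql in QUERIES.items()}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, load_date TEXT)")

empty = snapshot(conn, "2024-01-01")
conn.executemany("INSERT INTO orders (load_date) VALUES (?)", [("2024-01-01",)] * 3)
loaded = snapshot(conn, "2024-01-01")

print(empty)   # {'exists': None, 'count': 0, 'threshold': 0}
print(loaded)  # {'exists': 1, 'count': 3, 'threshold': 1}
```

All three shapes read as "not ready" on the empty table and "ready" once the rows exist. The threshold form is handy when a downstream task needs a minimum number of records and not just the first one. Other databases may return a boolean instead of 0 or 1 for the comparison.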

Key Points

  • SqlSensor waits by rerunning a SQL query until the first cell of its first row is a non-zero, non-empty value.
  • It helps synchronize Airflow tasks with database state.
  • You can set how often it checks (poke_interval) and how long it waits before timing out (timeout).
  • It works with databases whose Airflow connection is backed by a SQL (DB-API) hook, such as Postgres or MySQL.

Key Takeaways

  • SqlSensor holds a task until a SQL query returns a matching result.
  • It is useful for pausing workflows until required data is available in a database.
  • You can configure check frequency and timeout to control waiting behavior.
  • It supports multiple databases through Airflow's connection system.