HttpSensor in Airflow: What It Is and How It Works

HttpSensor in Airflow is a sensor that waits for a specific HTTP endpoint to return a desired response before continuing a workflow. It periodically sends HTTP requests and checks the response to decide if the task should proceed.

How It Works

HttpSensor acts like a watchful friend who keeps checking a website or API until it sees the answer it expects. It sends HTTP requests at set intervals and looks at the response to decide if the condition is met.

Imagine you want to start a task only after a web service is ready. The sensor keeps asking the service, "Are you ready yet?" and waits patiently until it gets a positive answer. This way, your workflow doesn’t start too early and fail.
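
The polling behaviour described above can be sketched in plain Python, outside Airflow. This is a minimal illustration and not HttpSensor's actual implementation: poke_until, check, and the injected sleep and clock parameters are hypothetical names chosen for this example.

```python
import time
from typing import Callable


def poke_until(
    check: Callable[[], bool],
    poke_interval: float,
    timeout: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call check() every poke_interval seconds until it returns True.

    Returns the number of attempts made. Raises TimeoutError if the
    condition is still unmet once `timeout` seconds have elapsed.
    """
    started = clock()
    attempts = 0
    while True:
        attempts += 1
        if check():
            return attempts
        if clock() - started >= timeout:
            raise TimeoutError(f"condition not met after {attempts} attempts")
        sleep(poke_interval)


# Simulated service (hypothetical) that answers "ready" on the third request.
answers = iter([False, False, True])
attempts_needed = poke_until(lambda: next(answers), poke_interval=0.01, timeout=5)
```

Here the simulated service says "not ready" twice, so the loop stops on the third attempt. A service that never becomes ready would end in a TimeoutError instead, which mirrors a sensor task failing when its timeout expires.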

Example

This example shows how to use HttpSensor to wait for a website to return a 200 status code before continuing.

python
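from datetime import datetime

from airflow import DAG
from airflow.providers.http.sensors.http import HttpSensor

# 'schedule' replaces the deprecated 'schedule_interval' (Airflow 2.4+), and
# a fixed start_date replaces the deprecated 'days_ago' helper.
with DAG(
    dag_id='http_sensor_example',
    start_date=datetime(2024, 1, 1),
    schedule='@daily',
    catchup=False,  # do not backfill runs between start_date and today
) as dag:
    wait_for_website = HttpSensor(
        task_id='wait_for_website',
        http_conn_id='http_default',  # connection that holds the base URL
        endpoint='api/health',        # appended to the connection's base URL
        # Succeed only when the endpoint answers with HTTP 200.
        response_check=lambda response: response.status_code == 200,
        poke_interval=10,  # seconds between requests
        timeout=60,        # fail the task after 60 seconds without success
    )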
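Output
The task 'wait_for_website' sends a GET request to 'http://<http_conn_id_base_url>/api/health' every 10 seconds. It succeeds as soon as a response with a 200 status code arrives and fails if that has not happened within 60 seconds. Note that in current versions of the HTTP provider, a 404 response counts as "not ready yet" by default, while other error responses fail the task immediately.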
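
The request URL above combines the base URL stored in the connection with the sensor's endpoint. A rough sketch of that idea follows. It is not Airflow's own code: build_url is a hypothetical helper and http://example.com is an assumed base URL used only for illustration.

```python
def build_url(base_url: str, endpoint: str) -> str:
    """Join a base URL and an endpoint with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


# Assumed base URL, standing in for whatever the 'http_default' connection holds.
health_url = build_url("http://example.com", "api/health")
```

Keeping the host in the connection and only the path in the task means the same DAG can point at a different environment by changing the connection alone.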

When to Use

Use HttpSensor when your workflow depends on an external web service or API being ready or returning specific data. Because it polls, it suits anything you can query over HTTP: REST APIs, health-check endpoints, or status pages that downstream tasks depend on.

For example, you might wait for a data provider's API to confirm data availability or for a web application to finish deployment before running tests.
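
For the data-availability case, the check usually needs to look at the response body and not just the status code. The sketch below assumes a provider API that returns JSON such as {"status": "ready"}. That payload shape, the data_is_available function, and the FakeResponse stand-in are all assumptions made for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Stand-in for requests.Response, exposing only what the check reads."""
    status_code: int
    text: str

    def json(self):
        return json.loads(self.text)


def data_is_available(response) -> bool:
    """Return True only when the API answers 200 and reports status "ready"."""
    if response.status_code != 200:
        return False
    try:
        payload = response.json()
    except ValueError:
        # Body was not valid JSON, so treat the data as not ready yet.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ready"


ready = data_is_available(FakeResponse(200, '{"status": "ready"}'))
pending = data_is_available(FakeResponse(200, '{"status": "processing"}'))
```

In a DAG, a function like this would be passed as response_check=data_is_available in place of the lambda, so the sensor keeps poking while the provider still reports that it is processing.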

Key Points

  • HttpSensor polls an HTTP endpoint repeatedly until a condition is met.
  • It helps coordinate workflows that depend on external web services.
  • You can customize the check by providing a function to inspect the HTTP response.
  • It uses Airflow connections to manage the base URL and authentication.

Key Takeaways

  • HttpSensor waits for an HTTP endpoint to meet a condition before proceeding.
  • It sends repeated HTTP requests at intervals until success or timeout.
  • Use it to ensure external web services are ready before running dependent tasks.
  • You can customize response checks with a function.
  • It integrates with Airflow connections for easy URL and auth management.