
SimpleHttpOperator for API calls in Apache Airflow - Commands & Configuration

Introduction
Sometimes you need to get data from or send data to a web service automatically. SimpleHttpOperator in Airflow helps you do this by making HTTP requests as part of your workflow.
It is a good fit in situations like these:
When you want to fetch data from a public API every hour to update your database.
When you need to send a notification to a web service after a task completes.
When you want to check the status of a web service regularly and act on the response.
When you want to trigger an external system by calling its API as part of your workflow.
When you want to automate data collection from a REST API without writing complex code.
Config File - example_dag.py
from airflow import DAG
from airflow.providers.http.operators.http import SimpleHttpOperator
from airflow.utils.dates import days_ago

with DAG(
    dag_id='http_api_call_example',
    schedule_interval='@hourly',
    start_date=days_ago(1),
    catchup=False
) as dag:

    call_api = SimpleHttpOperator(
        task_id='call_example_api',
        http_conn_id='example_api',
        endpoint='data',
        method='GET',
        response_check=lambda response: response.status_code == 200,
        log_response=True
    )

This DAG defines a workflow that runs every hour. The start date is one day in the past, and catchup=False stops Airflow from backfilling the runs it missed since then.

SimpleHttpOperator makes a GET request to the 'data' endpoint, relative to the base URL stored in the HTTP connection named 'example_api'.

The response_check callable fails the task unless the response status code is 200 (OK), and log_response=True writes the response body to the task log.
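A lambda is enough for a status check, but a named function can also validate the body. The following is a minimal sketch, assuming the API returns a JSON object that contains a `key` field. `check_payload` and `FakeResponse` are hypothetical names, and the fake response only stands in for the `requests.Response` object that Airflow passes to the callable:

```python
import json


def check_payload(response) -> bool:
    """Return True only if the status is 200 and the JSON body has a 'key' field."""
    if response.status_code != 200:
        return False
    try:
        body = response.json()
    except ValueError:  # the body was not valid JSON
        return False
    return isinstance(body, dict) and "key" in body


class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only to try the check locally."""

    def __init__(self, status_code: int, text: str):
        self.status_code = status_code
        self.text = text

    def json(self):
        return json.loads(self.text)


print(check_payload(FakeResponse(200, '{"key": "value"}')))  # True
print(check_payload(FakeResponse(200, "not json")))          # False
```

In the DAG you would pass it as `response_check=check_payload`; the task fails when the function returns False.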

Commands
This command creates an Airflow connection named 'example_api' pointing to the base URL of the API you want to call. It lets SimpleHttpOperator know where to send requests.
Terminal
airflow connections add example_api --conn-type http --conn-host https://api.example.com
Expected Output
Connection `example_api` added successfully
--conn-type - Specifies the connection type, here 'http' for HTTP API
--conn-host - Sets the base URL for the API
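Connections can also be supplied through environment variables named `AIRFLOW_CONN_<CONN_ID>` instead of the CLI. The sketch below builds the JSON form of that variable, which is assumed to be supported (Airflow 2.3 or later; older versions expect a URI string instead). It sets the variable only for the current Python process as an illustration; in practice you would export it in the environment of the scheduler and workers.

```python
import json
import os

conn_id = "example_api"
# Airflow looks up AIRFLOW_CONN_<CONN_ID>, with the connection ID upper-cased.
env_name = f"AIRFLOW_CONN_{conn_id.upper()}"

# JSON representation of the same connection the CLI command creates.
env_value = json.dumps({"conn_type": "http", "host": "https://api.example.com"})

os.environ[env_name] = env_value
print(env_name)   # AIRFLOW_CONN_EXAMPLE_API
print(env_value)  # {"conn_type": "http", "host": "https://api.example.com"}
```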
This command lists all available DAGs to confirm your DAG file is recognized by Airflow.
Terminal
airflow dags list
Expected Output
dag_id                | owner   | paused | last_run
http_api_call_example | airflow | False  | None
This command manually triggers the DAG to run immediately, so you can test the API call task.
Terminal
airflow dags trigger http_api_call_example
Expected Output
Created <DagRun http_api_call_example @ 2024-06-01T12:00:00+00:00: manual__2024-06-01T12:00:00+00:00, externally triggered: True>
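This command runs the API call task on its own and prints its log to the terminal, so you can verify the request was sent and the response received. Logs of scheduled or triggered runs are available in the Airflow web UI.
Terminal
airflow tasks test http_api_call_example call_example_api 2024-06-01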
Expected Output
[2024-06-01 12:00:01,000] {http.py:123} INFO - Making GET request to https://api.example.com/data
[2024-06-01 12:00:02,000] {http.py:145} INFO - Response status: 200
[2024-06-01 12:00:02,000] {http.py:146} INFO - Response content: {"key": "value"}
[2024-06-01 12:00:02,000] {taskinstance.py:123} INFO - Task succeeded
Key Concept

If you remember nothing else from this pattern, remember: SimpleHttpOperator turns an HTTP API call into an Airflow task that you configure with a connection, an endpoint, and a method, with no request-handling code of your own.

Common Mistakes
Not creating the HTTP connection in Airflow before using SimpleHttpOperator
The operator needs a connection ID to know the API base URL; without it, the task will fail.
Use 'airflow connections add' to create the HTTP connection with the correct host before running the DAG.
Using the wrong HTTP method or endpoint in the operator
The API call will fail or return errors if the method or endpoint is incorrect.
Check the API documentation and set the 'method' and 'endpoint' parameters correctly in SimpleHttpOperator.
Not checking the response status or content
A response can arrive with a success status and still carry an error message or incomplete data, so bad results pass silently through your workflow.
Use the 'response_check' parameter to verify the status and the body, so the task fails when the response is not what you expect.
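A quick way to catch a wrong endpoint is to work out the final URL before running the DAG. The helper below is a simplified approximation of how the connection host and the 'endpoint' parameter combine, not Airflow's actual code, and Airflow's own joining may differ in edge cases such as doubled slashes. `preview_url` is a hypothetical name:

```python
def preview_url(base_url: str, endpoint: str = "") -> str:
    """Join a connection's base URL and an operator endpoint with exactly one slash."""
    if not endpoint:
        return base_url
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


print(preview_url("https://api.example.com", "data"))    # https://api.example.com/data
print(preview_url("https://api.example.com/", "/data"))  # https://api.example.com/data
```

Comparing the printed URL with the API documentation shows a misplaced path segment before any request is sent.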
Summary
Create an HTTP connection in Airflow with the API base URL using 'airflow connections add'.
Define a DAG with SimpleHttpOperator to make the API call using the connection and endpoint.
Trigger the DAG and check task logs to verify the API call and response.