How to Use HttpHook in Airflow for HTTP Requests
Use Airflow's HttpHook to send HTTP requests: create an instance with a connection ID and an HTTP method such as GET or POST, then call run() to perform the request. This hook simplifies interacting with REST APIs inside Airflow tasks.
Syntax
The HttpHook is initialized with a connection ID that references an HTTP connection configured in Airflow, plus the HTTP method to use. You then call run() to send the request, specifying the endpoint and optional data or headers.
- http_conn_id: The Airflow connection ID for the HTTP service (defaults to http_default).
- method: The HTTP method, set on the constructor. It defaults to POST, so pass method='GET' explicitly for GET requests.
- run(endpoint=None, data=None, headers=None, extra_options=None): Sends the HTTP request and returns the response object.
```python
from airflow.providers.http.hooks.http import HttpHook

# The HTTP method is set on the hook, not passed to run()
http = HttpHook(method='GET', http_conn_id='my_http_connection')
response = http.run(endpoint='api/v1/resource')
```
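A POST request follows the same pattern, with the method chosen when the hook is created. The sketch below is a minimal illustration rather than the provider's canonical usage: `build_hook` and `post_json` are hypothetical helper names, and `my_http_connection` is assumed to be an existing Airflow connection. The hook is passed in as an argument so the helper can be exercised with a stand-in object outside Airflow.

```python
import json


def build_hook(conn_id="my_http_connection"):
    """Create a POST hook. Imported lazily so this module loads without Airflow."""
    from airflow.providers.http.hooks.http import HttpHook

    # The HTTP method is a constructor argument, not an argument of run().
    return HttpHook(method="POST", http_conn_id=conn_id)


def post_json(hook, endpoint, payload):
    """Send `payload` as a JSON body through an HttpHook-like object."""
    return hook.run(
        endpoint=endpoint,
        data=json.dumps(payload),  # serialized request body
        headers={"Content-Type": "application/json"},
    )
```

Inside a task this might be called as `post_json(build_hook(), 'api/v1/resource', {'name': 'example'})`, where the endpoint and payload are placeholders for your own API.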
Example
This example shows how to use HttpHook inside an Airflow task to make a GET request to a public API and print the JSON response.
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.http.hooks.http import HttpHook


def fetch_data():
    # 'httpbin' must be an Airflow connection whose host is https://httpbin.org
    http = HttpHook(method='GET', http_conn_id='httpbin')
    response = http.run(endpoint='get')
    print(response.json())


default_args = {
    'start_date': datetime(2024, 1, 1),
}

# Airflow 2.4+ also accepts schedule='@once' in place of schedule_interval
dag = DAG('http_hook_example', default_args=default_args, schedule_interval='@once')

fetch_task = PythonOperator(
    task_id='fetch_http_data',
    python_callable=fetch_data,
    dag=dag,
)
```
Output
{'args': {}, 'headers': {...}, 'origin': 'x.x.x.x', 'url': 'https://httpbin.org/get'}
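The example assumes a connection named `httpbin` already exists. One way to define it without the UI is an `AIRFLOW_CONN_<CONN_ID>` environment variable. The sketch below builds the JSON form of that variable, which Airflow 2.3 and later should accept; `connection_env` is a hypothetical helper name, and the host value is an assumption based on the URL in the output above.

```python
import json
import os


def connection_env(conn_id, host):
    """Return the (variable name, JSON value) pair describing an HTTP connection."""
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    value = json.dumps({"conn_type": "http", "host": host})
    return name, value


# Assumption: httpbin.org is the host behind the 'httpbin' connection.
name, value = connection_env("httpbin", "https://httpbin.org")
os.environ[name] = value  # only affects this process and its children
print(name, value)
```

In practice the variable is usually exported in the environment of the scheduler and workers rather than set from Python, since it must be visible to the process that runs the task.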
Common Pitfalls
- Not setting up the HTTP connection in Airflow UI or via environment variables causes connection errors.
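- Using an incorrect http_conn_id or endpoint path leads to failed requests.
- For POST requests, forgetting to pass data or sending incorrect headers can cause unexpected responses.
- Passing method to run() instead of the HttpHook constructor raises a TypeError; the hook also defaults to POST when no method is given.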
Always verify your connection and endpoint before running tasks.
```python
from airflow.providers.http.hooks.http import HttpHook

# Wrong: connection ID that does not exist in Airflow
http = HttpHook(method='GET', http_conn_id='wrong_id')
response = http.run(endpoint='get')  # Fails: the connection cannot be found

# Right: existing connection ID, method set on the hook
http = HttpHook(method='GET', http_conn_id='httpbin')
response = http.run(endpoint='get')
```
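Failed requests can also come from the server side. By default `run()` checks the response status and raises an exception for error codes, which fails the task. When a task needs to inspect the status itself, the `extra_options` dictionary can carry `check_response: False`. The sketch below assumes that behavior of the HTTP provider; `fetch_status` is a hypothetical helper name, and the hook is passed in so the function can be tried with a stand-in object.

```python
def fetch_status(hook, endpoint, timeout=10):
    """Call an HttpHook-like object and return (status_code, ok) without raising on 4xx/5xx."""
    response = hook.run(
        endpoint=endpoint,
        # check_response=False asks the hook not to raise on error statuses;
        # timeout (seconds) is forwarded to the underlying requests call.
        extra_options={"check_response": False, "timeout": timeout},
    )
    ok = 200 <= response.status_code < 300
    return response.status_code, ok
```

A task could then branch on the returned status, for example retrying on 503 or skipping downstream work on 404, instead of failing on every non-2xx response.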
Quick Reference
| Name | Description |
|---|---|
| http_conn_id | Airflow connection ID for the HTTP service |
| endpoint | API endpoint path to call, relative to the connection's host |
| method | HTTP method such as GET, POST, PUT, or DELETE (constructor argument, defaults to POST) |
| data | Payload for POST/PUT requests; sent as query parameters for GET |
| headers | Optional dictionary of HTTP headers |
| run() | Method that executes the HTTP request and returns the response |
Key Takeaways
- Initialize HttpHook with a valid Airflow HTTP connection ID and the HTTP method you need.
- Use the run() method to send the request, specifying the endpoint.
- Ensure your Airflow HTTP connection is correctly configured before use.
- For POST requests, provide data and headers as needed.
- Check API responses through the returned response object, for example response.status_code or response.json().