Apache Airflow · DevOps · ~30 mins

Connection management for cloud services in Apache Airflow - Mini Project: Build & Apply

📖 Scenario: You are setting up Apache Airflow to connect to a cloud storage service. To do this, you need to create and manage a connection in Airflow that stores the credentials and endpoint details securely. This project will guide you through creating a connection, configuring it, using it in a DAG, and finally printing the connection details.
🎯 Goal: Build an Airflow setup where you create a cloud service connection, configure it with necessary details, use it in a DAG to retrieve connection info, and print the connection details.
📋 What You'll Learn
Create an Airflow connection with exact ID and parameters
Add configuration variables for the connection
Write a DAG that uses the connection to get connection info
Print the connection details in the DAG task
💡 Why This Matters
🌍 Real World
Managing connections in Airflow is essential for securely accessing cloud services like storage, databases, or APIs without hardcoding credentials.
💼 Career
DevOps and Data Engineering roles often require setting up and managing Airflow connections to automate workflows that interact with cloud platforms.
1
Create an Airflow connection for cloud storage
Create an Airflow connection with the connection ID cloud_storage_conn. Set the connection type to Google Cloud, the host to https://storage.googleapis.com, the login to user123, and the password to pass123.
Apache Airflow
Need a hint?

Use the Connection class from airflow.models to create the connection object with the exact parameters.

2
Add configuration variables for the connection
Create a variable called project_id and set it to my-gcp-project. Also create a variable called location and set it to us-central1.
Apache Airflow
Need a hint?

Simply assign the exact string values to the variables project_id and location.
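Per the hint, these are ordinary Python variables in the DAG file, not Airflow Variables:

```python
# Configuration values used alongside the connection.
project_id = "my-gcp-project"
location = "us-central1"
```

If you wanted these values shared across DAGs instead, Airflow's Variable.set("project_id", "my-gcp-project") would store them in the metadata database, but plain assignments satisfy this step.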

3
Write a DAG that uses the connection to get connection info
Write a DAG named cloud_storage_dag with a single PythonOperator task called print_conn_info, whose Python callable is a function named print_connection_info. In that function, use Airflow's BaseHook.get_connection to get the connection with ID cloud_storage_conn. Extract the host, login, and password from the connection object and store them in variables named conn_host, conn_login, and conn_password respectively.
Need a hint?

Define a Python function that calls BaseHook.get_connection('cloud_storage_conn') and extracts the host, login, and password. Then create a DAG and a PythonOperator that runs this function.

4
Print the connection details in the DAG task
In the print_connection_info function, add print statements to display the connection host, login, and password. The output should be exactly:
Host: https://storage.googleapis.com
Login: user123
Password: pass123
Need a hint?

Use print(f"Host: {conn_host}") and similar print statements for login and password inside the function.
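The print statements the hint describes, shown with the three values as literals so the expected output is visible; inside the DAG they would come from BaseHook.get_connection instead. Printing a password is acceptable in this exercise, but real pipelines should keep secrets out of task logs.

```python
# Literal stand-ins for the values extracted from the connection in step 3.
conn_host = "https://storage.googleapis.com"
conn_login = "user123"
conn_password = "pass123"

print(f"Host: {conn_host}")        # Host: https://storage.googleapis.com
print(f"Login: {conn_login}")      # Login: user123
print(f"Password: {conn_password}")  # Password: pass123
```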