
Connection management for cloud services in Apache Airflow - Commands & Configuration

Introduction
Managing connections in Airflow helps you securely store and reuse credentials and settings needed to access cloud services. This avoids hardcoding sensitive information in your workflows and makes your automation safer and easier to maintain.
When you need to connect Airflow tasks to AWS services like S3 or Redshift without exposing keys in code
When you want to reuse the same cloud credentials across multiple workflows or DAGs
When you want to update cloud service credentials in one place without changing every workflow
When you want to keep your cloud connection details encrypted and secure within Airflow
When you want to switch between different cloud accounts or environments easily
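The payoff of all this is that DAG code references a connection by ID and never touches raw credentials. Below is a minimal sketch of what that looks like with the S3 hook from the Amazon provider; the bucket handling function and its name are hypothetical, and the import is guarded so the snippet is readable even without the provider installed.

```python
# Sketch: a task helper that resolves AWS credentials from an Airflow
# connection instead of hardcoding them. Assumes apache-airflow with the
# Amazon provider (apache-airflow-providers-amazon) is installed at runtime.
AWS_CONN_ID = "my_aws_conn"  # must match the conn id created in Airflow

try:
    from airflow.providers.amazon.aws.hooks.s3 import S3Hook
except ImportError:  # provider not installed in this environment
    S3Hook = None

def list_bucket_keys(bucket: str, prefix: str = ""):
    """List S3 object keys using credentials from the stored connection."""
    if S3Hook is None:
        raise RuntimeError("apache-airflow-providers-amazon is required")
    hook = S3Hook(aws_conn_id=AWS_CONN_ID)  # no keys appear in DAG code
    return hook.list_keys(bucket_name=bucket, prefix=prefix) or []
```

Rotating the credentials later means editing the connection once in Airflow; this code does not change.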
Commands
This command adds a new Airflow connection named 'my_aws_conn' for AWS. It stores your AWS access key and secret key securely, along with the region information, so your tasks can use it to access AWS services. Note that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY below are placeholders; substitute your actual key and secret.
Terminal
airflow connections add my_aws_conn --conn-type aws --conn-login AWS_ACCESS_KEY_ID --conn-password AWS_SECRET_ACCESS_KEY --conn-extra '{"region_name": "us-east-1"}'
Expected Output
Added connection `my_aws_conn`
--conn-type - Specifies the type of connection, here 'aws' for Amazon Web Services
--conn-login - Sets the username or access key for the connection
--conn-password - Sets the password or secret key for the connection
--conn-extra - Adds extra JSON-formatted parameters like region
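Airflow can also read connections from environment variables named AIRFLOW_CONN_<CONN_ID>, holding the same information as a URI. The sketch below builds a URI equivalent to the CLI command above; the helper function name is illustrative, and note that login and password must be URL-encoded since real secrets often contain special characters.

```python
from urllib.parse import quote, urlencode

def aws_conn_uri(login: str, password: str, region: str) -> str:
    """Build an Airflow connection URI for an AWS connection.

    Equivalent to the `airflow connections add` command shown above;
    exporting it as AIRFLOW_CONN_MY_AWS_CONN makes the connection
    visible to Airflow without touching the metadata database.
    """
    query = urlencode({"region_name": region})
    login_q = quote(login, safe="")        # URL-encode credentials so
    password_q = quote(password, safe="")  # special characters survive
    return f"aws://{login_q}:{password_q}@/?{query}"

uri = aws_conn_uri("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "us-east-1")
# shell usage: export AIRFLOW_CONN_MY_AWS_CONN='<uri>'
```

Environment-variable connections are handy for containerized deployments where credentials are injected by the orchestrator rather than stored in Airflow's database.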
This command retrieves and shows the details of the 'my_aws_conn' connection to verify it was added correctly.
Terminal
airflow connections get my_aws_conn
Expected Output
Conn Id: my_aws_conn
Conn Type: aws
Host: None
Login: AWS_ACCESS_KEY_ID
Password: AWS_SECRET_ACCESS_KEY
Schema: None
Port: None
Extra: {"region_name": "us-east-1"}
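The Extra field in that output is plain JSON, so a quick way to confirm required settings (a common failure point, covered under Common Mistakes below) is to parse it, as this small sketch shows:

```python
import json

# The Extra field from `airflow connections get` is JSON; parsing it
# confirms required settings like the region are actually present.
extra = json.loads('{"region_name": "us-east-1"}')
if not extra.get("region_name"):
    raise ValueError("region_name missing from connection extras")
```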
This command runs a task from your Airflow DAG for the logical date 2024-06-01; the task uses the 'my_aws_conn' connection to access AWS services during execution.
Terminal
airflow tasks run example_dag example_task 2024-06-01
Expected Output
[2024-06-01 12:00:00,000] {taskinstance.py:1234} INFO - Executing task example_task using connection my_aws_conn
[2024-06-01 12:00:05,000] {taskinstance.py:5678} INFO - Task completed successfully
This command deletes the 'my_aws_conn' connection when it is no longer needed, cleaning up your Airflow environment.
Terminal
airflow connections delete my_aws_conn
Expected Output
Deleted connection `my_aws_conn`
Key Concept

If you remember nothing else from connection management, remember: store your cloud credentials securely in Airflow connections to reuse them safely across workflows.

Common Mistakes
Hardcoding cloud credentials directly in DAG code
This exposes sensitive information and makes updating credentials difficult and risky
Use Airflow connections to store credentials securely and reference them in your DAGs
Using incorrect connection IDs or typos when referencing connections in tasks
Tasks will fail because they cannot find the connection details
Double-check connection IDs and use 'airflow connections get' to verify them
Not setting required extra parameters like region in the connection
Cloud service clients may fail to connect or use wrong defaults
Include all necessary extra JSON parameters when creating connections
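A simple guard against the typo and environment-switching mistakes above is to keep the connection ID in one well-named constant, mapped per deployment environment. This is a sketch; the environment names and connection IDs other than 'my_aws_conn' are hypothetical.

```python
import os

def conn_id_for(env: str) -> str:
    """Map a deployment environment to its Airflow connection id.

    Keeping this mapping in one place means switching AWS accounts is a
    one-line change, and a typo fails loudly instead of at task runtime.
    """
    mapping = {"dev": "my_aws_conn_dev", "prod": "my_aws_conn"}
    if env not in mapping:
        raise ValueError(f"unknown environment: {env}")
    return mapping[env]

# Resolve once at import time; the env var name is an assumption for
# illustration, defaulting to production when it is unset.
AWS_CONN_ID = conn_id_for(os.environ.get("DEPLOY_ENV", "prod"))
```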
Summary
Use 'airflow connections add' to securely store cloud service credentials and settings.
Verify connections with 'airflow connections get' before running workflows.
Reference these connections in your DAG tasks to avoid hardcoding sensitive data.
Clean up unused connections with 'airflow connections delete' to keep your environment tidy.