
How to Use Remote Logging in Apache Airflow

To use remote logging in Apache Airflow, set remote_logging = True in the [logging] section of airflow.cfg and configure a remote storage backend such as Amazon S3 or Google Cloud Storage (GCS). Airflow then sends task logs to that external storage, so they can be read without access to the worker machines.

Syntax

Remote logging in Airflow is configured in the airflow.cfg file under the [logging] section. Key settings include:

  • remote_logging: Enables or disables remote logging.
  • remote_log_conn_id: Connection ID for the remote storage service (e.g., S3, GCS).
  • remote_base_log_folder: The remote storage path where logs are saved.
  • logging_config_class: Optional import path to a custom Python logging configuration, for advanced setups.
ini
[logging]
remote_logging = True
remote_log_conn_id = MyS3Conn
remote_base_log_folder = s3://my-airflow-logs
logging_config_class = my_airflow_config.LOGGING_CONFIG
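The same [logging] options can also be supplied as environment variables, because Airflow reads any setting from a variable named AIRFLOW__{SECTION}__{KEY}. As a minimal sketch, the helper airflow_env_var below is hypothetical (it is not part of Airflow) and only builds those names for the settings listed above:

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow checks for a config option.

    Hypothetical helper for illustration; Airflow itself only defines the
    AIRFLOW__{SECTION}__{KEY} naming convention.
    """
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Example values copied from the airflow.cfg snippet above.
settings = {
    "remote_logging": "True",
    "remote_log_conn_id": "MyS3Conn",
    "remote_base_log_folder": "s3://my-airflow-logs",
}

# Print lines in NAME=value form, e.g. for a container environment file.
for key, value in settings.items():
    print(f"{airflow_env_var('logging', key)}={value}")
```

Environment variables set this way take precedence over the values in airflow.cfg, which can be convenient when the same image runs in several environments.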

Example

This example enables remote logging to Amazon S3 by editing airflow.cfg and creating an S3 connection in the Airflow UI or CLI.

ini
[logging]
remote_logging = True
remote_log_conn_id = MyS3Conn
remote_base_log_folder = s3://my-airflow-logs

# In the Airflow UI or CLI, create a connection named 'MyS3Conn' that holds your AWS credentials.

# After this, task logs are uploaded to the specified S3 bucket automatically.
Output
Task logs appear in the my-airflow-logs S3 bucket, organized by DAG and task instance.
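To see how a remote_base_log_folder value breaks down into a bucket and an optional key prefix, the following sketch uses only the Python standard library. The function name split_remote_log_folder and the second URL (with a prod/airflow prefix) are assumptions made for illustration, not values from the configuration above:

```python
from __future__ import annotations

from urllib.parse import urlparse


def split_remote_log_folder(url: str) -> tuple[str, str, str]:
    """Split a storage URL into (scheme, bucket, key prefix)."""
    parsed = urlparse(url)
    # The bucket name is the URL's network location; the rest is the prefix.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


print(split_remote_log_folder("s3://my-airflow-logs"))
# ('s3', 'my-airflow-logs', '')

# Hypothetical variant that stores logs under a prefix inside the bucket.
print(split_remote_log_folder("s3://my-airflow-logs/prod/airflow"))
# ('s3', 'my-airflow-logs', 'prod/airflow')
```

An empty prefix means logs are written from the root of the bucket; a non-empty one keeps them under that folder.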

Common Pitfalls

  • Not enabling remote_logging: Logs stay local and are not uploaded.
  • Incorrect connection ID: Airflow cannot authenticate to remote storage.
  • Wrong remote_base_log_folder path: Logs fail to upload or are saved in unexpected locations.
  • Missing permissions: The remote storage credentials must have write access.
  • Forgetting to restart Airflow: Changes in airflow.cfg require a restart to take effect.
ini
# Wrong: remote logging is disabled, so logs stay on the local machine
[logging]
remote_logging = False
remote_log_conn_id = MyS3Conn
remote_base_log_folder = s3://my-airflow-logs

# Correct: remote logging is enabled
[logging]
remote_logging = True
remote_log_conn_id = MyS3Conn
remote_base_log_folder = s3://my-airflow-logs
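Some of these pitfalls can be caught before restarting Airflow. The sketch below is a hypothetical pre-flight check (the function find_remote_logging_problems is not part of Airflow) that parses airflow.cfg-style text with Python's configparser. It assumes an S3- or GCS-style scheme://bucket URL and cannot verify credentials or write permissions:

```python
from __future__ import annotations

import configparser
from urllib.parse import urlparse


def find_remote_logging_problems(cfg_text: str) -> list[str]:
    """Return human-readable problems found in the [logging] section."""
    # Disable interpolation so '%' characters in other options are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("logging"):
        return ["missing [logging] section"]

    section = parser["logging"]
    problems = []
    if not section.getboolean("remote_logging", fallback=False):
        problems.append("remote_logging is not enabled")
    if not section.get("remote_log_conn_id", fallback="").strip():
        problems.append("remote_log_conn_id is empty")
    folder = urlparse(section.get("remote_base_log_folder", fallback="").strip())
    if not folder.scheme or not folder.netloc:
        problems.append("remote_base_log_folder is not a scheme://bucket URL")
    return problems


wrong_cfg = "\n".join([
    "[logging]",
    "remote_logging = False",
    "remote_log_conn_id = MyS3Conn",
    "remote_base_log_folder = s3://my-airflow-logs",
])
print(find_remote_logging_problems(wrong_cfg))
# ['remote_logging is not enabled']
```

An empty list means none of the checked mistakes were found; whether the connection ID exists and can write to the bucket still has to be confirmed in Airflow itself.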

Quick Reference

| Setting | Description | Example Value |
| --- | --- | --- |
| remote_logging | Enable remote logging | True |
| remote_log_conn_id | Connection ID for remote storage | MyS3Conn |
| remote_base_log_folder | Remote storage path for logs | s3://my-airflow-logs |
| logging_config_class | Custom logging config class (optional) | my_airflow_config.LOGGING_CONFIG |

Key Takeaways

  • Set remote_logging = True in airflow.cfg to activate remote logging.
  • Set remote_log_conn_id to a valid connection that holds credentials for your remote storage.
  • Point remote_base_log_folder at your remote storage location (e.g., an S3 bucket).
  • Ensure your remote storage credentials have write permission for the log location.
  • Restart Airflow services after changing the logging configuration so the changes take effect.