Airflow · How-To · Beginner · 4 min read

How to Configure Logging in Apache Airflow

To configure logging in Airflow, edit the airflow.cfg file to set logging parameters like base_log_folder and remote_logging. For advanced control, customize the logging_config_class to point to a Python logging configuration that defines handlers, formatters, and loggers.

📐 Syntax

The main logging configuration in Airflow is done in the airflow.cfg file under the [logging] section. Key parameters include:

  • base_log_folder: Directory where logs are stored locally.
  • remote_logging: Enable or disable remote log storage (e.g., S3, GCS).
  • remote_base_log_folder: Remote storage location for logs.
  • logging_config_class: Python path to a logging configuration class for advanced setups.

The logging_config_class setting is the dotted import path of a Python dictionary that defines loggers, handlers, and formatters using the standard logging.config.dictConfig schema.

ini
[logging]
base_log_folder = /path/to/airflow/logs
remote_logging = False
remote_base_log_folder = 
# leave empty to use Airflow's built-in DEFAULT_LOGGING_CONFIG
# (defined in airflow.config_templates.airflow_local_settings)
logging_config_class =

💻 Example

This example shows how to enable remote logging to an S3 bucket and customize the logging configuration by creating a Python file with logging settings.

python
# airflow_local_settings.py -- place this file on the PYTHONPATH,
# e.g. in $AIRFLOW_HOME/config/
LOGGING_CONFIG = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'airflow': {
            'format': '[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s',
            'datefmt': '%Y-%m-%d %H:%M:%S'
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'airflow',
            'stream': 'ext://sys.stdout',
        },
        'task': {
            'class': 'logging.FileHandler',
            'formatter': 'airflow',
            'filename': '/tmp/airflow_task.log',
        },
    },
    'loggers': {
        'airflow.task': {
            'handlers': ['task', 'console'],
            'level': 'INFO',
            'propagate': False,
        },
    },
}

# airflow.cfg changes:
# [logging]
# remote_logging = True
# remote_base_log_folder = s3://my-airflow-logs
# remote_log_conn_id = aws_default  # connection with credentials for the bucket
# logging_config_class = airflow_local_settings.LOGGING_CONFIG
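
Before pointing Airflow at a custom dictionary, it can be worth sanity-checking it with the standard library alone. Below is a minimal standalone sketch (not part of Airflow): it applies a trimmed-down version of the dict above with logging.config.dictConfig, which raises ValueError on a malformed schema, and the temp-file path stands in for the real handler filename.

```python
# validate_logging_config.py -- standalone sanity check, not part of Airflow.
import logging.config
import os
import tempfile

# Placeholder path for this check; the real config would use the
# 'task' handler's filename from airflow_local_settings.py.
log_path = os.path.join(tempfile.gettempdir(), "airflow_task_check.log")

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "airflow": {
            "format": "[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s",
        },
    },
    "handlers": {
        "task": {
            "class": "logging.FileHandler",
            "formatter": "airflow",
            "filename": log_path,
        },
    },
    "loggers": {
        "airflow.task": {"handlers": ["task"], "level": "INFO", "propagate": False},
    },
}

# dictConfig raises ValueError on a broken schema -- catching mistakes here
# is easier than debugging a scheduler that quietly reverted to defaults.
logging.config.dictConfig(LOGGING_CONFIG)
logging.getLogger("airflow.task").info("logging config OK")

with open(log_path) as f:
    print("logging config OK" in f.read())  # → True
```

If dictConfig accepts the dictionary here, Airflow should accept the same structure via logging_config_class.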

⚠️ Common Pitfalls

Common mistakes when configuring Airflow logging include:

  • Not setting remote_logging to True when using remote storage, causing logs to be saved only locally.
  • Incorrect paths in base_log_folder or remote_base_log_folder leading to missing logs.
  • Misconfiguring the logging_config_class path or the Python logging dictionary, causing Airflow to fall back to its default logging configuration without an obvious error.
  • Forgetting to restart Airflow services after changing logging settings.
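
Several of these pitfalls come down to path problems, which can be caught before Airflow even starts. Here is a small pre-flight sketch; the helper name and the placeholder path are illustrative, not part of Airflow:

```python
import os

def check_log_folder(path: str) -> bool:
    """Return True if the folder exists (or can be created) and is writable."""
    try:
        # exist_ok=True tolerates an existing directory, but still
        # raises if the path exists and is not a directory.
        os.makedirs(path, exist_ok=True)
    except OSError:
        return False
    return os.access(path, os.W_OK)

# Point this at your configured base_log_folder; /tmp is just a placeholder.
print(check_log_folder("/tmp/airflow_logs_check"))
```

Running this as the same user that runs the scheduler and workers catches permission mismatches early.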

Examples of wrong and right settings:

ini
# Wrong (remote_logging enabled but no remote path)
remote_logging = True
remote_base_log_folder = 

# Right
remote_logging = True
remote_base_log_folder = s3://my-airflow-logs

📊 Quick Reference

Summary tips for Airflow logging configuration:

  • Always set base_log_folder to a writable directory.
  • Enable remote_logging and set remote_base_log_folder for centralized log storage.
  • Use logging_config_class to customize log formats and handlers.
  • Restart Airflow scheduler and webserver after changes.
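
Putting these tips together, a complete [logging] section might look like the fragment below; the paths, bucket name, and connection id are placeholders to adapt to your deployment:

```ini
[logging]
base_log_folder = /opt/airflow/logs
remote_logging = True
remote_base_log_folder = s3://my-airflow-logs
remote_log_conn_id = aws_default
logging_config_class = airflow_local_settings.LOGGING_CONFIG
```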

Key Takeaways

  • Configure logging mainly via the [logging] section in airflow.cfg.
  • Use remote_logging and remote_base_log_folder to store logs centrally.
  • Customize logging_config_class with a Python dict for advanced logging control.
  • Always restart Airflow services after changing logging settings.
  • Check paths and permissions to avoid missing or inaccessible logs.