
Audit logging in Hadoop

Introduction

Audit logging helps you keep track of who did what and when in your Hadoop system. It is useful for security and troubleshooting.

Typical situations where audit logging helps:
You want to see who accessed or changed files in HDFS.
You need to check if someone tried to do something they shouldn't.
You want to keep a record of user actions for compliance rules.
You are troubleshooting problems and want to know the sequence of events.
You want to monitor system usage patterns over time.
Syntax
Hadoop
hadoop.security.authorization=true
hadoop.security.audit.logger=org.apache.hadoop.security.authorize.AuditLogger
hadoop.security.audit.loggers=org.apache.hadoop.security.authorize.AuditLogger
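The settings above are shown as name=value pairs. In core-site.xml or hdfs-site.xml, each one is written as a <property> element with a <name> and a <value>.
Property names and logger classes can differ between Hadoop versions and distributions, so check them against the configuration reference for your release.
Restart the affected Hadoop services after changing these settings so the new values take effect.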
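The name=value lines above become <property> elements once they are placed in an XML configuration file. As a rough illustration, the following Python sketch performs that conversion. The function name settings_to_xml is made up for this example and is not part of Hadoop.

```python
from xml.sax.saxutils import escape


def settings_to_xml(lines):
    """Convert 'name=value' lines into a Hadoop-style <configuration> string.

    Hypothetical helper for illustration only; it is not a Hadoop tool.
    """
    parts = ["<configuration>"]
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got: {line!r}")
        parts.append("  <property>")
        parts.append(f"    <name>{escape(name.strip())}</name>")
        parts.append(f"    <value>{escape(value.strip())}</value>")
        parts.append("  </property>")
    parts.append("</configuration>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(settings_to_xml(["hadoop.security.authorization=true"]))
```

Values are XML-escaped, so characters such as & or < in a setting do not break the resulting file.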
Examples
This enables service-level authorization in Hadoop, which the audit logging setup in this tutorial builds on.
Hadoop
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
This sets the class that writes the audit events.
Hadoop
<property>
  <name>hadoop.security.audit.logger</name>
  <value>org.apache.hadoop.security.authorize.AuditLogger</value>
</property>
This lists the audit logger classes to load. Here the value names the same class as the previous property.
Hadoop
<property>
  <name>hadoop.security.audit.loggers</name>
  <value>org.apache.hadoop.security.authorize.AuditLogger</value>
</property>
Sample Program

This complete example combines the three properties in a single <configuration> block, in the XML format used by Hadoop's core-site.xml or hdfs-site.xml file.

Hadoop
<configuration>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.security.audit.logger</name>
    <value>org.apache.hadoop.security.authorize.AuditLogger</value>
  </property>
  <property>
    <name>hadoop.security.audit.loggers</name>
    <value>org.apache.hadoop.security.authorize.AuditLogger</value>
  </property>
</configuration>
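Before restarting services, it can help to confirm that a file really contains the settings you expect. The sketch below is one possible check in Python, using only the standard library. The helper names load_properties, missing_settings and authorization_enabled are hypothetical, and the property names are simply the ones used in this tutorial.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Return {name: value} for every <property> in a configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_settings(props, required):
    """List the required property names that are absent or empty."""
    return [name for name in required if not props.get(name)]


def authorization_enabled(props):
    """True when hadoop.security.authorization is set to 'true'."""
    return props.get("hadoop.security.authorization", "false").lower() == "true"


if __name__ == "__main__":
    sample = """<configuration>
      <property>
        <name>hadoop.security.authorization</name>
        <value>true</value>
      </property>
    </configuration>"""
    props = load_properties(sample)
    print(authorization_enabled(props))
    print(missing_settings(props, ["hadoop.security.audit.logger"]))
```

To check a real file, read its contents into a string first and pass that string to load_properties.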
Important Notes

Audit logs are usually stored in the Hadoop log directory. Check your log4j or logback configuration for the exact location.
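Once you have found the audit log, you can read it with a small script. The sketch below assumes the layout commonly seen in HDFS NameNode audit logs: a logging prefix ending in a colon, followed by tab-separated key=value fields such as ugi, ip, cmd and src. That layout and the sample line are assumptions for illustration. Compare them with a real line from your own cluster, since the format depends on your Hadoop version and logging pattern.

```python
from collections import Counter

# Hypothetical sample line in the assumed layout; not taken from a real cluster.
SAMPLE = (
    "2024-01-15 10:30:00,123 INFO FSNamesystem.audit: "
    "allowed=true\tugi=alice (auth:SIMPLE)\tip=/10.0.0.5\t"
    "cmd=open\tsrc=/data/report.csv\tdst=null\tperm=null"
)


def parse_audit_line(line):
    """Return the key=value fields of one audit line as a dict, or None."""
    _prefix, sep, payload = line.rstrip("\n").partition(": ")
    if not sep:
        return None  # no "prefix: payload" separator, so not an audit line
    record = {}
    for item in payload.split("\t"):
        key, eq, value = item.partition("=")
        if eq:
            record[key.strip()] = value.strip()
    return record or None


def count_actions(lines):
    """Count how often each (user, command) pair appears."""
    counts = Counter()
    for line in lines:
        record = parse_audit_line(line)
        if record:
            user = record.get("ugi", "?").split(" ")[0]
            counts[(user, record.get("cmd", "?"))] += 1
    return counts


if __name__ == "__main__":
    print(parse_audit_line(SAMPLE))
    print(count_actions([SAMPLE, SAMPLE]))
```

The per-user counts give a quick view of who did what, which is the question audit logging is meant to answer.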

Audit logging can generate a lot of data on a busy cluster, because each audited operation adds a log entry. Monitor disk space and review your log rotation settings.
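One simple way to keep an eye on audit log growth is to total the size of the audit files in the log directory. The sketch below does this in Python. The file pattern *audit*.log* and the function names are assumptions; adjust them to match how your cluster names and rotates its audit files.

```python
from pathlib import Path


def audit_log_usage(log_dir, pattern="*audit*.log*"):
    """Return (file_count, total_bytes) for files in log_dir matching pattern.

    The default pattern is an assumption about how audit files are named.
    """
    files = [p for p in Path(log_dir).glob(pattern) if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def over_budget(log_dir, max_bytes, pattern="*audit*.log*"):
    """True when the matching files together exceed max_bytes."""
    _count, total = audit_log_usage(log_dir, pattern)
    return total > max_bytes
```

A check like this could run from a scheduled job and raise an alert when over_budget returns True.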

Restrict write access to the audit log files and their directory to the account that runs the Hadoop services, so other users cannot alter or delete the records.
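A quick way to spot audit files that are open to tampering is to look for write permission granted to the group or to other users. The following sketch checks the POSIX permission bits of a local file. The function name is hypothetical, and this is only a basic check: it does not look at directory permissions or ACLs.

```python
import os
import stat


def writable_by_others(path):
    """True if the file's mode grants write access to group or others."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))
```

For example, a file with mode 640 passes this check, while a file with mode 664 is reported as exposed.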

Summary

Audit logging tracks user actions in Hadoop for security and troubleshooting.

Enable it by setting hadoop.security.authorization to true and configuring audit logger classes.

Audit logs help you see who accessed or changed data and when.