
Apache Ranger for authorization in Hadoop

Introduction

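Apache Ranger is a centralized framework for controlling who can access data across Hadoop components such as HDFS and Hive. Administrators define access policies in one place, and Ranger records access attempts in audit logs.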

Use Ranger when:
- You want to decide who can read or write data in Hadoop.
- You need to track who accessed sensitive data for security.
- You want to set rules for different users or groups in your data system.
- You want a simple way to manage access across many data tools.
- You need to quickly update permissions without changing code.
Syntax
Policies are created in the Apache Ranger web UI or through its REST API.
A policy includes:
- Service (like HDFS, Hive)
- Resources (like folders, tables)
- Users or groups
- Permissions (read, write, execute)

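Simplified policy structure (illustrative; the JSON that Ranger's REST API exchanges is more nested, with resource value lists and per-item access lists):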
{
  "policyName": "example_policy",
  "service": "hdfs",
  "resources": {
    "path": "/user/data"
  },
  "users": ["alice"],
  "permissions": ["read", "write"]
}

Most administrators manage policies through the Ranger web interface, which is easier than writing JSON by hand.

A single policy can grant different permissions to different users and groups on the same resource.
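The simplified structure above flattens several fields that are nested in practice. The sketch below builds a policy dictionary in the shape commonly used by Ranger's public v2 policy model (`service`, `name`, `resources` with `values`, `policyItems` with `accesses`). The helper name `build_hdfs_policy` and the service name `hadoop_hdfs` are hypothetical, and the field names are an assumption to verify against the documentation for your Ranger version.

```python
import json


def build_hdfs_policy(name, service, path, users, access_types, recursive=True):
    """Return a policy dict shaped like Ranger's v2 policy model (assumed layout)."""
    return {
        "service": service,   # name of the Ranger service instance (hypothetical here)
        "name": name,         # policy name
        "isEnabled": True,
        "resources": {
            # Each resource holds a list of values rather than a single string
            "path": {"values": [path], "isRecursive": recursive},
        },
        "policyItems": [
            {
                "users": list(users),
                # One entry per granted access type
                "accesses": [{"type": t, "isAllowed": True} for t in access_types],
            }
        ],
    }


policy = build_hdfs_policy(
    "example_policy", "hadoop_hdfs", "/user/data", ["alice"], ["read", "write"]
)
print(json.dumps(policy, indent=2))
```

Building the payload in a function keeps the nesting in one place, so callers only supply the values that differ between policies.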

Examples
This policy lets user 'bob' read files in the /data/reports folder.
Policy for HDFS folder:
{
  "policyName": "data_read_policy",
  "service": "hdfs",
  "resources": {"path": "/data/reports"},
  "users": ["bob"],
  "permissions": ["read"]
}
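To create a policy like this through the REST API instead of the UI, a client would POST a JSON payload to Ranger Admin. The sketch below only constructs the request with Python's standard library and does not send it. The host `ranger-server`, the credentials, the service name `hadoop_hdfs`, and the endpoint path `/service/public/v2/api/policy` are assumptions to check against your installation, and the payload uses the more nested layout that the real API is expected to accept rather than the simplified form shown above.

```python
import base64
import json
import urllib.request


def build_create_policy_request(base_url, policy, username, password):
    """Build (but do not send) a POST request that would create a policy."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/service/public/v2/api/policy",  # assumed endpoint path
        data=json.dumps(policy).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",  # HTTP basic authentication
        },
        method="POST",
    )


# Hypothetical values for illustration only
policy = {
    "service": "hadoop_hdfs",
    "name": "data_read_policy",
    "resources": {"path": {"values": ["/data/reports"], "isRecursive": True}},
    "policyItems": [
        {"users": ["bob"], "accesses": [{"type": "read", "isAllowed": True}]}
    ],
}
request = build_create_policy_request(
    "http://ranger-server:6080", policy, "admin", "changeme"
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

Separating request construction from sending makes the payload easy to inspect before it reaches a live server.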
This policy allows the 'analysts' group to query the 'transactions' table in Hive.
Policy for Hive table:
{
  "policyName": "sales_table_access",
  "service": "hive",
  "resources": {"database": "sales_db", "table": "transactions"},
  "groups": ["analysts"],
  "permissions": ["select"]
}
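To make the matching logic concrete, the sketch below evaluates access requests against the two simplified policies above, entirely in local Python. It is a toy model, not Ranger's policy engine: the functions `resource_matches` and `is_allowed`, the prefix-based path matching, the deny-by-default rule, and the user `carol` are simplifying assumptions made for illustration.

```python
POLICIES = [
    {"policyName": "data_read_policy", "service": "hdfs",
     "resources": {"path": "/data/reports"},
     "users": ["bob"], "permissions": ["read"]},
    {"policyName": "sales_table_access", "service": "hive",
     "resources": {"database": "sales_db", "table": "transactions"},
     "groups": ["analysts"], "permissions": ["select"]},
]


def resource_matches(policy_res, requested_res):
    """Paths match by prefix; every other resource key must match exactly."""
    for key, wanted in policy_res.items():
        actual = requested_res.get(key)
        if actual is None:
            return False
        if key == "path":
            inside = actual.startswith(wanted.rstrip("/") + "/")
            if actual != wanted and not inside:
                return False
        elif actual != wanted:
            return False
    return True


def is_allowed(user, groups, service, resource, permission, policies=POLICIES):
    """Return True if any policy grants the permission; deny otherwise."""
    for p in policies:
        subject_ok = (user in p.get("users", [])
                      or bool(set(groups) & set(p.get("groups", []))))
        if (p["service"] == service and subject_ok
                and permission in p["permissions"]
                and resource_matches(p["resources"], resource)):
            return True
    return False


print(is_allowed("bob", [], "hdfs", {"path": "/data/reports/q1.csv"}, "read"))   # True
print(is_allowed("bob", [], "hdfs", {"path": "/data/reports/q1.csv"}, "write"))  # False
print(is_allowed("carol", ["analysts"], "hive",
                 {"database": "sales_db", "table": "transactions"}, "select"))   # True
```

The third call shows why groups are convenient: `carol` is not named in any policy, but membership in `analysts` is enough to grant `select`.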
Sample Program

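This conceptual example sends a request to a Ranger server to ask whether user 'alice' has read access to a folder in HDFS, then prints whether access is allowed. The endpoint path and the request and response fields are illustrative placeholders, so check the REST API documentation for your Ranger version before adapting it.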

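# Conceptual example: ask a Ranger server whether a user may access a resource.
# The endpoint path and the payload/response fields below are illustrative
# placeholders; consult the REST API documentation for your Ranger version.
import requests

# Placeholder endpoint for an access check (6080 is Ranger Admin's default HTTP port)
url = "http://ranger-server:6080/service/public/api/secure/policy/check"

# Ask whether user 'alice' can read /user/data
payload = {
    "serviceName": "hdfs",
    "resourcePath": "/user/data",
    "user": "alice",
    "accessType": "read"
}

try:
    # A real Ranger server also expects credentials, e.g. auth=("user", "password")
    response = requests.post(url, json=payload, timeout=10)
except requests.RequestException as exc:
    # Covers connection failures, timeouts, and similar transport errors
    print(f"Could not reach Ranger: {exc}")
else:
    if response.status_code == 200:
        result = response.json()
        print(f"Access allowed: {result.get('allowed', False)}")
    else:
        print(f"Error checking access: {response.status_code}")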
Important Notes

Apache Ranger centralizes access control for many big data tools.

After creating or changing a policy, check Ranger's audit logs to confirm that requests are allowed or denied as intended.

Use groups to simplify managing many users' permissions.

Summary

Apache Ranger manages who can access data in big systems.

Policies define permissions for users or groups on resources.

Use Ranger UI or API to create and check access rules easily.