
YARN scheduling policies in Hadoop

Introduction

YARN scheduling policies decide how cluster resources such as memory and CPU are shared among applications. A suitable policy lets many jobs run side by side without any of them waiting too long.

Typical situations where the scheduling policy matters:
When multiple users submit jobs to a Hadoop cluster at the same time.
When important jobs must get resources first.
When resources should be shared fairly among all running jobs.
When you need to limit how much of the cluster each job or queue can use.
When you want to improve cluster utilization and reduce job wait time.
Syntax
Hadoop
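<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>FULLY_QUALIFIED_SCHEDULER_CLASS</value>
</property>

Common scheduler classes (all under org.apache.hadoop.yarn.server.resourcemanager.scheduler):
- FIFO: fifo.FifoScheduler
- Capacity: capacity.CapacityScheduler
- Fair: fair.FairScheduler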

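Set the scheduler in yarn-site.xml through the yarn.resourcemanager.scheduler.class property, then restart the ResourceManager so the change takes effect. If the property is unset, Apache Hadoop uses the Capacity Scheduler.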

Each policy has different rules for allocating resources.
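To confirm which policy a configuration file selects, the property can be read back programmatically. The following is a minimal sketch using only Python's standard library; the helper name scheduler_class and the inline sample XML are illustrative assumptions, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Sample yarn-site.xml content (illustrative; a real file holds many more properties)
SAMPLE_YARN_SITE = """
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
</configuration>
"""

def scheduler_class(xml_text):
    """Return the configured scheduler class, or None if the property is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "yarn.resourcemanager.scheduler.class":
            return prop.findtext("value")
    return None

print(scheduler_class(SAMPLE_YARN_SITE))
```

A return value of None only means the file does not override the property; the cluster then runs whatever scheduler its defaults specify.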

Examples
First In First Out: Jobs run in the order they arrive.
Hadoop
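<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>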
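Capacity Scheduler: Divides the cluster into queues, each with a configured share of resources.
Hadoop
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>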
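The idea of dividing a cluster into queues can be sketched with a little arithmetic. This is a simplified model, not the Capacity Scheduler's actual code: it assumes a single level of queues whose percentages must sum to 100, and the queue names and cluster size are hypothetical.

```python
def queue_guarantees(cluster_memory_mb, queue_percentages):
    """Convert per-queue capacity percentages into guaranteed memory in MB."""
    total = sum(queue_percentages.values())
    if abs(total - 100) > 1e-9:
        raise ValueError(f"queue capacities must sum to 100, got {total}")
    return {
        name: cluster_memory_mb * percent / 100
        for name, percent in queue_percentages.items()
    }

# Hypothetical queues on a cluster with 100 GB (102400 MB) of memory
guarantees = queue_guarantees(102400, {"prod": 60, "dev": 30, "adhoc": 10})
for name, memory_mb in guarantees.items():
    print(f"{name}: {memory_mb:.0f} MB guaranteed")
```

The real scheduler adds features this sketch leaves out, such as nested queues and letting a queue borrow idle capacity from others.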
Fair Scheduler: Shares resources evenly among all jobs over time.
Hadoop
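<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>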
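One way to picture "sharing evenly" is max-min fairness: every job gets an equal share, and anything a small job does not need is redistributed to the others. The sketch below illustrates that idea only; it is not the Fair Scheduler's implementation, and the function name, job names, and numbers are assumptions made for the example.

```python
def fair_shares(capacity, demands):
    """Split capacity by max-min fairness: equal shares, capped at each job's demand."""
    shares = {}
    pending = dict(demands)
    remaining = float(capacity)
    while pending:
        equal_share = remaining / len(pending)
        # Jobs that want no more than an equal share are fully satisfied
        small = {job: d for job, d in pending.items() if d <= equal_share}
        if not small:
            # Every remaining job wants more than an equal share: split evenly
            for job in pending:
                shares[job] = equal_share
            break
        for job, demand in small.items():
            shares[job] = float(demand)
            remaining -= demand
            del pending[job]
    return shares

# Three hypothetical jobs competing for 100 containers
print(fair_shares(100, {"a": 20, "b": 50, "c": 70}))
```

Here job a receives the 20 containers it asked for, and the remaining 80 are split equally between b and c.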
Sample Program

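This example shows how to check which scheduler a cluster is running by querying the ResourceManager REST API with Python's standard library. A client cannot switch the scheduler at runtime; to change it, edit yarn-site.xml and restart the ResourceManager.

Hadoop
import json
from urllib.request import urlopen

# ResourceManager web address; adjust host and port for your cluster
RM_URL = "http://localhost:8088"

# Ask the ResourceManager which scheduler is active
with urlopen(f"{RM_URL}/ws/v1/cluster/scheduler") as response:
    info = json.load(response)

# The type is, for example, "capacityScheduler", "fairScheduler", or "fifoScheduler"
scheduler_type = info["scheduler"]["schedulerInfo"]["type"]
print(f"Current scheduler: {scheduler_type}")

# To change the scheduler, set yarn.resourcemanager.scheduler.class
# in yarn-site.xml and restart the ResourceManager.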
Important Notes

FIFO is simple but can cause long waits if big jobs run first.
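The waiting effect is easy to see with a small calculation. This sketch assumes a simplified model in which each job occupies the whole cluster and jobs run strictly one after another; the durations (in minutes) are made up for illustration.

```python
def fifo_wait_times(durations):
    """Return how long each job waits before starting when jobs run one at a time, in order."""
    waits = []
    elapsed = 0
    for duration in durations:
        waits.append(elapsed)
        elapsed += duration
    return waits

big_first = fifo_wait_times([60, 5, 5])    # a 60-minute job arrives first
small_first = fifo_wait_times([5, 5, 60])  # the same jobs, with the big job last

print("Big job first:", big_first, "average wait", sum(big_first) / len(big_first))
print("Small jobs first:", small_first, "average wait", sum(small_first) / len(small_first))
```

With the big job at the front, both short jobs wait an hour or more; with it at the back, neither short job waits longer than five minutes.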

Capacity Scheduler suits multi-tenant clusters where each team or queue needs a guaranteed share of resources.

Fair Scheduler tries to balance resource use so no job waits too long.

Summary

YARN scheduling policies control how cluster resources are shared.

Common policies are FIFO, Capacity, and Fair Scheduler.

Choosing the right policy helps run jobs efficiently and fairly.