
YARN scheduling policies in Hadoop - Step-by-Step Execution

Concept Flow - YARN scheduling policies
- Job arrives in YARN
- Scheduler selects policy
  - FIFO: Jobs run in order
  - Capacity: Resources split by queues
  - Fair: Resources shared fairly
- Allocate resources to containers
- Run tasks
- Job completes or waits
YARN receives jobs and uses the cluster's configured scheduling policy (FIFO, Capacity, or Fair) to allocate resources and run tasks.
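The policy is normally set once for the whole cluster in `yarn-site.xml` through the `yarn.resourcemanager.scheduler.class` property, rather than picked per job. As a rough sketch, the mapping below pairs the three policy names used here with the ResourceManager scheduler classes they correspond to in Apache Hadoop. The helper `scheduler_class` is a hypothetical name used only for illustration.

```python
# Property in yarn-site.xml that tells the ResourceManager which scheduler to load.
SCHEDULER_PROPERTY = "yarn.resourcemanager.scheduler.class"

# Common package prefix of the three built-in scheduler implementations.
_PREFIX = "org.apache.hadoop.yarn.server.resourcemanager.scheduler"

SCHEDULER_CLASSES = {
    "FIFO": f"{_PREFIX}.fifo.FifoScheduler",
    "Capacity": f"{_PREFIX}.capacity.CapacityScheduler",
    "Fair": f"{_PREFIX}.fair.FairScheduler",
}


def scheduler_class(policy: str) -> str:
    """Hypothetical helper: return the scheduler class name for a policy name."""
    try:
        return SCHEDULER_CLASSES[policy]
    except KeyError:
        raise ValueError(f"unknown policy: {policy!r}") from None


if __name__ == "__main__":
    for policy in SCHEDULER_CLASSES:
        print(f"{policy}: {SCHEDULER_PROPERTY} = {scheduler_class(policy)}")
```

Changing this property usually means restarting the ResourceManager, so the choice is a cluster-level decision, not something a single job controls.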
Execution Sample
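jobs = ["Job1", "Job2", "Job3"]  # queued in arrival order
scheduler = "FIFO"
def allocate_resources(job):  # placeholder, not a YARN API
    print(f"{scheduler}: allocating containers for {job}")
def run(job):  # placeholder, not a YARN API
    print(f"{scheduler}: running {job}")

for job in jobs:  # each job finishes before the next one starts
    allocate_resources(job)
    run(job)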
This code runs jobs one by one in the order they arrive using FIFO scheduling.
Execution Table
Step | Job  | Scheduler Policy | Action             | Resources Allocated | Status
1    | Job1 | FIFO             | Allocate resources | 4 containers        | Running
2    | Job1 | FIFO             | Run tasks          | 4 containers        | Running
3    | Job1 | FIFO             | Complete           | 0 containers        | Finished
4    | Job2 | FIFO             | Allocate resources | 3 containers        | Running
5    | Job2 | FIFO             | Run tasks          | 3 containers        | Running
6    | Job2 | FIFO             | Complete           | 0 containers        | Finished
7    | Job3 | FIFO             | Allocate resources | 2 containers        | Running
8    | Job3 | FIFO             | Run tasks          | 2 containers        | Running
9    | Job3 | FIFO             | Complete           | 0 containers        | Finished
💡 All jobs completed in FIFO order; no jobs remain to be scheduled.
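The nine steps above can be reproduced with a small simulation. This is only a sketch: the container counts (4, 3, 2) come from the Execution Table, while `fifo_trace` and the tuple layout are made-up names for illustration and have nothing to do with the real YARN API.

```python
from typing import List, Tuple

# Container demand per job, taken from the Execution Table.
JOBS = [("Job1", 4), ("Job2", 3), ("Job3", 2)]

# (step, job, policy, action, containers allocated, status)
Row = Tuple[int, str, str, str, int, str]


def fifo_trace(jobs: List[Tuple[str, int]]) -> List[Row]:
    """Build the step-by-step trace for jobs run strictly in arrival order."""
    rows: List[Row] = []
    step = 0
    for name, containers in jobs:
        # Each job goes through all three phases before the next job starts.
        phases = (
            ("Allocate resources", containers, "Running"),
            ("Run tasks", containers, "Running"),
            ("Complete", 0, "Finished"),
        )
        for action, allocated, status in phases:
            step += 1
            rows.append((step, name, "FIFO", action, allocated, status))
    return rows


if __name__ == "__main__":
    for row in fifo_trace(JOBS):
        print(row)
```

Because the inner loop finishes all three phases for one job before the outer loop moves on, no two jobs ever hold containers at the same time, which is exactly the FIFO behavior in the table.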
Variable Tracker
Variable            | Start   | After Step 1 | After Step 4 | After Step 7 | Final
Job1 Status         | Waiting | Running      | Finished     | Finished     | Finished
Job2 Status         | Waiting | Waiting      | Running      | Finished     | Finished
Job3 Status         | Waiting | Waiting      | Waiting      | Running      | Finished
Resources Allocated | 0       | 4            | 3            | 2            | 0
Key Moments - 2 Insights
Why does Job2 wait until Job1 finishes in FIFO scheduling?
Because FIFO runs jobs strictly in arrival order, Job2 cannot start until Job1 completes. In the Execution Table, Job1 finishes at Step 3 and Job2 is first allocated resources at Step 4.
How does resource allocation differ between Capacity and Fair schedulers?
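Capacity divides cluster resources among queues, giving each queue a guaranteed share (idle capacity can be lent to other queues), while Fair adjusts shares dynamically so that running jobs get roughly equal resources over time. Both differ from FIFO's one-job-at-a-time approach.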
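To make the Capacity idea concrete, here is a minimal sketch of how queue percentages turn into guaranteed container counts. The queue names (`prod`, `dev`), the 10-container cluster, and the function `queue_guarantees` are all assumptions for illustration. The real Capacity Scheduler works with memory and vcores and has more settings, such as a maximum capacity per queue.

```python
from typing import Dict


def queue_guarantees(total_containers: int, queue_percent: Dict[str, int]) -> Dict[str, int]:
    """Split a cluster's containers among queues by configured percentage.

    Assumes sibling queue percentages must add up to 100.
    """
    if sum(queue_percent.values()) != 100:
        raise ValueError("queue capacities must sum to 100")
    # Integer division: any leftover container from rounding is left unassigned here.
    return {
        queue: total_containers * percent // 100
        for queue, percent in queue_percent.items()
    }


if __name__ == "__main__":
    # Hypothetical cluster with 10 containers and two queues.
    print(queue_guarantees(10, {"prod": 70, "dev": 30}))
```

A job submitted to `dev` can count on its queue's share even while `prod` is busy, which is the main difference from FIFO, where a later job waits for everything ahead of it.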
Visual Quiz - 3 Questions
Test your understanding
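According to the Execution Table, what is the status of Job2 at Step 3?
A. Waiting
B. Running
C. Finished
D. Not started
💡 Hint
The Execution Table lists only Job1 at Step 3, and Job2 is not allocated resources until Step 4. Check the Job2 Status row in the Variable Tracker for the steps in between.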
At which step does Job3 start running in FIFO scheduling?
A. Step 4
B. Step 7
C. Step 6
D. Step 9
💡 Hint
Look at the 'Action' and 'Status' columns for Job3 in the Execution Table.
If we switch to Fair scheduler, how would resource allocation change?
A. Jobs run one by one with full resources
B. Only the first job gets resources
C. Resources are shared among jobs simultaneously
D. Jobs run in reverse order
💡 Hint
Fair scheduler shares resources fairly among running jobs, unlike FIFO.
Concept Snapshot
YARN Scheduling Policies:
- FIFO: Jobs run one after another in arrival order.
- Capacity: Resources split across queues, each with a guaranteed capacity.
- Fair: Resources shared dynamically for fairness.
- Scheduler allocates containers to run tasks.
- Choice affects job wait and resource use.
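The Fair policy in the list above can be sketched as a max-min fair split: containers are handed out one at a time in rounds until the cluster is full or every job has what it asked for. This is a simplified model, not the Fair Scheduler's actual algorithm (which works on memory by default and supports weights and queues). The function name `fair_shares` and the 6-container cluster are assumptions. The job demands (4, 3, 2) come from the FIFO example.

```python
from typing import Dict


def fair_shares(total: int, demands: Dict[str, int]) -> Dict[str, int]:
    """Max-min fair split of `total` containers among jobs with given demands."""
    shares = {job: 0 for job in demands}
    remaining = total
    # Jobs that still want more containers.
    active = [job for job, need in demands.items() if need > 0]
    while remaining > 0 and active:
        # One round: give each active job a single container, in order.
        for job in list(active):
            if remaining == 0:
                break
            shares[job] += 1
            remaining -= 1
            if shares[job] == demands[job]:
                active.remove(job)  # demand met, stop giving to this job
    return shares


if __name__ == "__main__":
    demands = {"Job1": 4, "Job2": 3, "Job3": 2}
    # Hypothetical cluster with only 6 containers: all three jobs run at once.
    print(fair_shares(6, demands))
```

With FIFO, Job1 would take its 4 containers first and the others would wait. Here all three jobs start together with a smaller share each, so short jobs are not stuck behind long ones.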
Full Transcript
YARN scheduling policies decide how jobs get resources to run. FIFO runs jobs one by one in order. Capacity scheduler splits resources by queues with set limits. Fair scheduler shares resources evenly among jobs. This example shows FIFO running three jobs in order, allocating containers, running tasks, and finishing each job before starting the next. Variables track job status and resource use. Key points include why jobs wait in FIFO and how resource sharing differs in other policies. Quizzes check understanding of job status at steps and scheduler behavior.