YARN scheduling policies in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding YARN Capacity Scheduler Queues

In YARN's Capacity Scheduler, what is the primary purpose of defining multiple queues?

A. To allocate cluster resources among different organizations or teams fairly
B. To increase the total number of nodes in the cluster automatically
C. To reduce the memory usage of the ResourceManager
D. To speed up the execution of a single job by splitting it into multiple tasks
💡 Hint

Think about how resources are shared among users or groups in a multi-tenant environment.
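
As a rough illustration of the multi-tenant idea in this hint, the sketch below splits a cluster's memory among queues by a configured percentage. The queue names, the percentages, and the `guaranteed_memory` helper are hypothetical and are not part of any YARN API; they only model the arithmetic behind per-queue guarantees.

```python
def guaranteed_memory(cluster_memory_mb, queue_percentages):
    """Return each queue's guaranteed share of cluster memory, in MB.

    Hypothetical helper: it models the idea of per-queue capacity
    percentages, not the actual Capacity Scheduler implementation.
    """
    if sum(queue_percentages.values()) != 100:
        raise ValueError("queue percentages must sum to 100")
    return {
        queue: cluster_memory_mb * pct // 100
        for queue, pct in queue_percentages.items()
    }


# Example with made-up team queues on a 10,000 MB cluster.
shares = guaranteed_memory(
    10000, {"engineering": 50, "analytics": 30, "marketing": 20}
)
print(shares)  # {'engineering': 5000, 'analytics': 3000, 'marketing': 2000}
```

Each team gets a predictable slice of the cluster regardless of what the other teams submit, which is the point of defining separate queues.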

Predict Output
intermediate
Output of YARN Fair Scheduler Allocation Calculation

Given the following simplified Python snippet that models resource allocation in the YARN Fair Scheduler, what value does it print for allocatedMemory?

Hadoop
allocatedMemory = 0
queueCapacities = {'queueA': 40, 'queueB': 60}
clusterMemory = 10000
for queue, capacity in queueCapacities.items():
    allocatedMemory += clusterMemory * (capacity / 100)
print(allocatedMemory)
A. 4000.0
B. 10000.0
C. 6000.0
D. 0
💡 Hint

Sum the memory allocated to each queue based on their capacity percentages.

Data Output
advanced
YARN Scheduler Resource Allocation Table

Consider a YARN cluster with 3 queues: Q1, Q2, and Q3. The cluster has 120 GB of memory. The Capacity Scheduler assigns the queues capacities of 50%, 30%, and 20%, respectively. If Q1 uses 40 GB, Q2 uses 20 GB, and Q3 uses 10 GB, how much memory (in GB) remains available in each queue?

Hadoop
cluster_memory = 120
queue_capacities = {'Q1': 0.5, 'Q2': 0.3, 'Q3': 0.2}
queue_usage = {'Q1': 40, 'Q2': 20, 'Q3': 10}
remaining_memory = {}
for q in queue_capacities:
    max_alloc = cluster_memory * queue_capacities[q]
    remaining_memory[q] = max_alloc - queue_usage[q]
print(remaining_memory)
A. {'Q1': 20.0, 'Q2': 10.0, 'Q3': 10.0}
B. {'Q1': 40.0, 'Q2': 10.0, 'Q3': 20.0}
C. {'Q1': 20.0, 'Q2': 16.0, 'Q3': 14.0}
D. {'Q1': 30.0, 'Q2': 16.0, 'Q3': 14.0}
💡 Hint

Calculate max allocation per queue and subtract current usage.

🔧 Debug
advanced
Identify the Error in YARN FIFO Scheduler Code

What error will the following YARN FIFO Scheduler code snippet produce?

Hadoop
jobs = ['job1', 'job2', 'job3']
for i in range(len(jobs)):
    print(jobs[i+1])
A. IndexError: list index out of range
B. SyntaxError: invalid syntax
C. TypeError: 'int' object is not iterable
D. No error, prints all jobs
💡 Hint

Check the loop index and list access carefully.
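
For context on what a FIFO policy is meant to do, here is a minimal sketch that starts jobs strictly in submission order without any manual index arithmetic. The `run_fifo` helper is hypothetical and only models the ordering; it is not how YARN's FIFO scheduler is implemented.

```python
from collections import deque


def run_fifo(jobs):
    """Return the jobs in the order a FIFO policy would start them."""
    pending = deque(jobs)  # oldest submission sits at the left end
    started = []
    while pending:
        # Always take the job that has been waiting the longest.
        started.append(pending.popleft())
    return started


print(run_fifo(["job1", "job2", "job3"]))  # ['job1', 'job2', 'job3']
```

Taking items from a queue instead of computing positions by hand sidesteps the kind of indexing mistake this problem asks about.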

🚀 Application
expert
Choosing the Best Scheduler for a Mixed Workload

A company runs both long-running batch jobs and short interactive jobs on a YARN cluster. Which scheduler policy is best suited to ensure fair resource sharing and low latency for interactive jobs?

A. None; use an external resource manager instead
B. FIFO Scheduler to process jobs in submission order
C. Capacity Scheduler with strict queue capacities
D. Fair Scheduler to dynamically share resources among jobs
💡 Hint

Consider which scheduler balances fairness and responsiveness.
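
To make "fairness" concrete, the sketch below computes a simplified max-min fair split of one resource among competing jobs: small demands are met in full and the leftover is shared equally among the jobs that still want more. The function name, job names, and numbers are illustrative assumptions; this models the general idea only and is not any YARN scheduler's actual implementation, which also accounts for factors such as weights and queue hierarchies.

```python
def max_min_fair_share(capacity, demands):
    """Split capacity so no job exceeds its demand and spare capacity
    is shared equally among jobs that are still unsatisfied."""
    allocation = {job: 0.0 for job in demands}
    unsatisfied = set(demands)
    remaining = float(capacity)
    eps = 1e-9  # tolerance for floating-point leftovers

    while unsatisfied and remaining > eps:
        share = remaining / len(unsatisfied)  # equal split for this round
        for job in sorted(unsatisfied):
            grant = min(share, demands[job] - allocation[job])
            allocation[job] += grant
            remaining -= grant
        # Keep only the jobs that still want more than they have.
        unsatisfied = {
            job for job in unsatisfied
            if demands[job] - allocation[job] > eps
        }
    return allocation


# One large batch job competing with two small interactive jobs.
result = max_min_fair_share(
    100, {"batch": 90, "interactive_1": 10, "interactive_2": 20}
)
print(result)
```

In this example the two interactive jobs receive their full demands (10 and 20) and the batch job receives roughly the remaining 70, so short jobs are not stuck waiting behind the long one.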