YARN scheduling policies in Hadoop - Time & Space Complexity
We want to understand how the time YARN takes to schedule tasks grows as more jobs and resources are involved: how does the scheduler handle more work, and what affects its speed?
Analyze the time complexity of the YARN Capacity Scheduler's allocation loop.
```
for each queue in queues:
    for each application in queue.applications:
        for each request in application.requests:
            allocate a container if resources are available
```
This code tries to allocate containers to applications in each queue based on resource availability.
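The loop above can be sketched as runnable code. This is an illustrative simplification, not YARN's actual `CapacityScheduler` implementation: the `queues` structure and the field names are hypothetical, and real schedulers track resources per node and per queue rather than from a single pool.

```python
# Simplified sketch of the allocation loop (hypothetical data model,
# not YARN's real API). Resources are modeled as a single integer.

def schedule(queues, available_resources):
    """Visit every queue, every app, every request: O(q * a * r)."""
    allocations = []
    for queue in queues:                      # q iterations
        for app in queue["applications"]:     # a iterations per queue
            for request in app["requests"]:   # r iterations per app
                if request["resources"] <= available_resources:
                    available_resources -= request["resources"]
                    allocations.append((app["id"], request["resources"]))
    return allocations

# Example: two queues; app-1 asks for two containers of size 2, app-2 for one.
queues = [
    {"applications": [{"id": "app-1",
                       "requests": [{"resources": 2}, {"resources": 2}]}]},
    {"applications": [{"id": "app-2",
                       "requests": [{"resources": 2}]}]},
]
print(schedule(queues, available_resources=5))
```

With 5 units available, both of app-1's requests fit (leaving 1 unit), so app-2's request is skipped.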
Look at the loops that repeat work:
- Primary operation: Nested loops over queues, applications, and container requests.
- How many times: Number of queues (q), times number of applications per queue (a), times container requests per application (r).
As the number of queues, applications, and requests grow, the scheduler does more work.
| Input Size (q, a, r) | Approx. Operations |
|---|---|
| 10 queues, 5 apps, 3 requests | 10 * 5 * 3 = 150 |
| 100 queues, 10 apps, 5 requests | 100 * 10 * 5 = 5,000 |
| 1000 queues, 20 apps, 10 requests | 1000 * 20 * 10 = 200,000 |
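The table's counts follow directly from multiplying the three loop bounds, which a short script can confirm:

```python
# Total operations for the nested loops = q * a * r,
# using the (q, a, r) combinations from the table above.
for q, a, r in [(10, 5, 3), (100, 10, 5), (1000, 20, 10)]:
    print(f"{q} queues x {a} apps x {r} requests = {q * a * r} operations")
```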
Pattern observation: The work grows quickly as more queues, applications, or requests are added.
Time Complexity: O(q * a * r)
This means the scheduling time grows in proportion to the product of the number of queues, the applications per queue, and the container requests per application.
[X] Wrong: "The scheduler time only depends on the number of applications."
[OK] Correct: The scheduler also loops over queues and container requests, so all three affect the time.
Understanding how scheduling time grows helps you design better resource managers and explain system behavior clearly.
"What if the scheduler used a priority queue to pick applications instead of looping over all? How would the time complexity change?"