Application lifecycle in YARN in Hadoop - Time & Space Complexity
We want to understand how the time to manage an application in YARN changes as the application size grows.
How does YARN handle starting, running, and finishing an application as it gets bigger?
Analyze the time complexity of the following simplified YARN application lifecycle code.
// Simplified YARN application lifecycle (illustrative pseudocode)
ApplicationMaster am = new ApplicationMaster();
am.registerApplicationMaster();            // one-time step
for (Container container : containers) {   // runs once per container
    am.launchContainer(container);
}
am.monitorContainers();                    // checks each launched container
am.unregisterApplicationMaster();          // one-time step
This code shows how the ApplicationMaster registers, launches containers one by one, monitors them, and then unregisters.
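To make the lifecycle concrete, the following is a minimal, self-contained sketch that imitates the four steps and counts how many operations they perform. It does not use the Hadoop libraries: `SimulatedApplicationMaster`, its `operations` counter, and `runLifecycle` are hypothetical names invented for illustration, and one monitoring pass is assumed to check each container exactly once.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ApplicationMaster; not part of the Hadoop API.
class SimulatedApplicationMaster {
    private final List<String> running = new ArrayList<>();
    private int operations = 0; // counts every lifecycle step performed

    void registerApplicationMaster() {
        operations++; // one step, independent of the number of containers
    }

    void launchContainer(String containerId) {
        running.add(containerId);
        operations++; // one step per container
    }

    void monitorContainers() {
        // Assumption: a single monitoring pass with one status check per container.
        for (String id : running) {
            operations++;
        }
    }

    void unregisterApplicationMaster() {
        operations++; // one step, independent of the number of containers
    }

    int operations() {
        return operations;
    }

    // Runs the whole lifecycle for n containers and returns the step count.
    static int runLifecycle(int n) {
        SimulatedApplicationMaster am = new SimulatedApplicationMaster();
        am.registerApplicationMaster();
        for (int i = 0; i < n; i++) {
            am.launchContainer("container_" + i);
        }
        am.monitorContainers();
        am.unregisterApplicationMaster();
        return am.operations();
    }

    public static void main(String[] args) {
        System.out.println(runLifecycle(10)); // prints 22: 1 + 10 + 10 + 1
    }
}
```

Registration and unregistration each add one step, while launching and monitoring each add one step per container, so the count for `n` containers in this sketch is `2n + 2`.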
Look for loops or repeated steps in the lifecycle.
- Primary operation: The loop that launches containers; monitoring also has to check every launched container.
- How many times: Once for each container in the application.
As the number of containers grows, the time to launch and monitor them grows too.
| Input Size (n containers) | Approx. Operations |
|---|---|
| 10 | About 10 container launches and monitoring steps |
| 100 | About 100 container launches and monitoring steps |
| 1000 | About 1000 container launches and monitoring steps |
Pattern observation: The work grows in direct proportion to the number of containers, so ten times as many containers means roughly ten times as many steps.
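The rows of the table can be reproduced with a small counting program. This is only a sketch: `ContainerStepCounter` and `countSteps` are hypothetical names, and it assumes one launch request and one monitoring check per container.

```java
// Hypothetical step counter: tallies launches and monitoring checks for n containers.
class ContainerStepCounter {
    // Returns {launches, monitoringChecks} for an application with n containers.
    static long[] countSteps(int n) {
        long launches = 0;
        long checks = 0;
        for (int i = 0; i < n; i++) {
            launches++; // one launch request per container
        }
        for (int i = 0; i < n; i++) {
            checks++;   // one status check per container in a monitoring pass
        }
        return new long[] {launches, checks};
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            long[] steps = countSteps(n);
            System.out.println(n + " containers: " + steps[0]
                    + " launches, " + steps[1] + " checks");
        }
    }
}
```

Each tenfold increase in `n` produces a tenfold increase in both counters, which is the linear pattern the table describes.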
Time Complexity: O(n)
This means the time to manage the application grows linearly with the number of containers: doubling the containers roughly doubles the time.
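The O(n) result can be derived by adding up the cost of each lifecycle step. The constants below are assumed symbols, not measured values: $c_{\text{reg}}$ and $c_{\text{unreg}}$ stand for the one-time registration and unregistration costs, and $c_{\text{launch}}$ and $c_{\text{mon}}$ for the per-container launch and monitoring costs, with a single monitoring pass assumed.

```latex
\begin{aligned}
T(n) &= c_{\text{reg}} + n\,c_{\text{launch}} + n\,c_{\text{mon}} + c_{\text{unreg}} \\
     &= (c_{\text{launch}} + c_{\text{mon}})\,n + (c_{\text{reg}} + c_{\text{unreg}}) \\
     &\in O(n)
\end{aligned}
```

The one-time costs form a constant term that does not depend on $n$, so only the per-container term determines the growth rate.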
[X] Wrong: "Launching containers happens all at once, so time stays the same no matter how many containers there are."
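[OK] Correct: Each container launch is a separate step, so more containers mean more work and more time. Even if some launches overlap, the total number of launches still grows with the number of containers.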
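One way to test the "all at once" intuition is to simulate overlapping launches with a thread pool. The sketch below is an illustration under stated assumptions, not how YARN itself schedules work: `ParallelLaunchSketch`, `launchAll`, and `wavesNeeded` are hypothetical names, each launch is modeled as a single counter increment, and every launch is assumed to take about the same time.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation of overlapping container launches.
class ParallelLaunchSketch {
    // Launches n simulated containers on `workers` threads and returns
    // the total number of launch calls that were executed.
    static int launchAll(int n, int workers) {
        AtomicInteger launches = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        Runnable launch = () -> launches.incrementAndGet(); // one simulated launch
        for (int i = 0; i < n; i++) {
            pool.execute(launch);
        }
        pool.shutdown(); // accept no new tasks; queued launches still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("launches did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for launches", e);
        }
        return launches.get();
    }

    // Number of sequential "waves" needed if `workers` launches can overlap:
    // the ceiling of n / workers.
    static int wavesNeeded(int n, int workers) {
        return (n + workers - 1) / workers;
    }

    public static void main(String[] args) {
        System.out.println(launchAll(1000, 8));     // prints 1000
        System.out.println(wavesNeeded(1000, 10));  // prints 100
    }
}
```

The total number of launches is still `n` regardless of the pool size. With a fixed number of workers `p`, the number of waves is about `n / p`, which shrinks the elapsed time by a constant factor but still grows as `n` grows.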
Understanding how YARN handles applications step-by-step helps you explain resource management clearly and confidently.
"What if the ApplicationMaster launched containers in parallel instead of one by one? How would the time complexity change?"