Cluster upgrade strategies in Kubernetes - Time & Space Complexity
When upgrading a Kubernetes cluster, it is important to understand how the time taken grows as the cluster size increases.
We want to know how the upgrade process scales with the number of nodes.
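Analyze the time complexity of the rolling update strategy below. It is defined on a Deployment, so the units being replaced are pods, but the same reasoning applies to nodes upgraded one at a time.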
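apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod below the desired count
      maxSurge: 1         # at most one pod above the desired count
  replicas: 5
  # selector and labels are required by apps/v1; the label value is an assumed placeholder
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app-container
          image: example/app:v2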
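This snippet shows a rolling update: pods are replaced with the new version gradually, with at most one pod unavailable (maxUnavailable: 1) and at most one extra pod above the desired count (maxSurge: 1) at any time.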
Identify the operations that repeat, the equivalent of loops, recursion, or array traversals in code.
- Primary operation: Updating each pod one at a time in sequence.
- How many times: Once per pod, so equal to the number of pods (n).
As the number of pods increases, the upgrade time grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 pod updates |
| 100 | 100 pod updates |
| 1000 | 1000 pod updates |
Pattern observation: Doubling the number of pods roughly doubles the upgrade time.
Time Complexity: O(n)
This means the upgrade time grows linearly with the number of pods in the cluster.
[X] Wrong: "Upgrading multiple pods at once always makes the upgrade time constant regardless of cluster size."
[OK] Correct: Even if some pods update simultaneously, the total time still depends on how many pods need updating and how many can update at once.
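The dependence on both the pod count and the batch size can be sketched with a simplified model. It assumes every pod takes the same time to become ready and that a fixed number of pods is replaced per round; `batch_size` is a hypothetical parameter standing in for what `maxUnavailable` and `maxSurge` allow. Real rollouts also depend on readiness probes, scheduling, and image pulls.

```python
import math


def rollout_rounds(pod_count: int, batch_size: int) -> int:
    """Number of sequential rounds needed to replace every pod.

    Simplified model (an assumption, not how the Deployment controller
    is implemented): exactly `batch_size` pods are replaced per round.
    """
    if pod_count < 0 or batch_size < 1:
        raise ValueError("pod_count must be >= 0 and batch_size >= 1")
    return math.ceil(pod_count / batch_size)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Batch size 1 matches the table above; batch size 3 needs about
        # a third as many rounds, but growth with n is still linear.
        print(n, rollout_rounds(n, 1), rollout_rounds(n, 3))
```

In this model, tripling the batch size cuts the rounds to about a third, yet doubling the pod count still doubles the rounds, so the complexity stays O(n) for any fixed batch size.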
Understanding how upgrade time scales helps you plan cluster maintenance and avoid downtime, a key skill in real-world Kubernetes management.
"What if we increase maxUnavailable to 3? How would the time complexity change?"
Practice
Solution
Step 1: Understand the role of control plane nodes
Control plane nodes run the API server and manage cluster state, and a kubelet must never be newer than the API server it talks to, so the control plane is upgraded and stabilized first.
Step 2: Upgrade worker nodes after control plane
Worker nodes run workloads and depend on the control plane, so upgrade them after the control plane nodes.
Final Answer:
Upgrade control plane nodes first, then worker nodes -> Option C
Quick Check:
Control plane first, workers second [OK]
- Upgrading worker nodes before control plane
- Upgrading all nodes at once causing downtime
- Skipping control plane upgrade
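The ordering rule from this solution can be expressed as a small sketch. The `Node` type, the role strings, and the node names are hypothetical and exist only for illustration; a real cluster would report roles through node labels.

```python
from typing import NamedTuple


class Node(NamedTuple):
    name: str
    role: str  # "control-plane" or "worker" (values assumed for this sketch)


def upgrade_order(nodes: list[Node]) -> list[str]:
    """Return node names with control plane nodes first, then workers.

    sorted() is stable, so nodes keep their original relative order
    within each group.
    """
    ordered = sorted(nodes, key=lambda node: node.role != "control-plane")
    return [node.name for node in ordered]


if __name__ == "__main__":
    cluster = [
        Node("worker-1", "worker"),
        Node("cp-1", "control-plane"),
        Node("worker-2", "worker"),
    ]
    print(upgrade_order(cluster))
```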
Solution
Step 1: Identify the correct drain command syntax
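The command that safely evicts pods from a node is 'kubectl drain', here with flags to ignore DaemonSet-managed pods and to allow deleting pods that use local emptyDir data. Since kubectl 1.20 the second flag is spelled --delete-emptydir-data; --delete-local-data is its deprecated older name.
Step 2: Verify other options are incorrect
Upgrade and delete commands do not drain nodes; cordon only marks the node unschedulable and does not evict pods.
Final Answer:
kubectl drain <node-name> --ignore-daemonsets --delete-local-data -> Option A
Quick Check:
Drain command with correct flags = A [OK]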
- Using 'kubectl cordon' instead of 'drain'
- Deleting nodes instead of draining
- Missing flags causing pod eviction failure
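As a rough illustration of the drain command discussed in this solution, the sketch below only assembles the argument list and does not run anything. The `legacy_flag` parameter is a hypothetical switch for older kubectl releases that still expect `--delete-local-data` rather than `--delete-emptydir-data`.

```python
import shlex


def drain_command(node_name: str, legacy_flag: bool = False) -> list[str]:
    """Build the argument list for draining one node.

    Nothing is executed here; pass the result to subprocess.run()
    only against a cluster you intend to drain.
    """
    local_data_flag = "--delete-local-data" if legacy_flag else "--delete-emptydir-data"
    return ["kubectl", "drain", node_name, "--ignore-daemonsets", local_data_flag]


if __name__ == "__main__":
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(drain_command("node1")))
```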
1. Drain node1
2. Upgrade node1
3. Uncordon node1
4. Repeat for node2 and node3
Solution
Step 1: Analyze the upgrade steps
Each node is drained to safely evict its pods, upgraded, then uncordoned so it can accept pods again.
Step 2: Understand impact on cluster availability
Upgrading nodes one at a time, draining each first, keeps workloads running on the other nodes and minimizes downtime.
Final Answer:
Cluster remains available with minimal downtime -> Option B
Quick Check:
Draining and upgrading nodes one by one [OK]
- Assuming cluster goes down during upgrades
- Not draining nodes causing pod failures
- Upgrading all nodes simultaneously
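The drain, upgrade, uncordon sequence can be modeled in a few lines to see why availability holds. This is only a sketch: the step strings are placeholders, the upgrade itself is tool-specific and left out, and the model assumes the remaining nodes have capacity for the evicted pods.

```python
def rolling_node_plan(nodes: list[str]) -> list[str]:
    """List the steps in order: each node is finished before the next starts."""
    plan = []
    for node in nodes:
        plan.append(f"drain {node}")
        plan.append(f"upgrade {node}")  # placeholder for the tool-specific upgrade
        plan.append(f"uncordon {node}")
    return plan


def min_schedulable_nodes(nodes: list[str]) -> int:
    """Lowest number of nodes accepting pods at any point in the plan."""
    schedulable = set(nodes)
    lowest = len(schedulable)
    for node in nodes:
        schedulable.discard(node)  # drain: node stops accepting pods
        lowest = min(lowest, len(schedulable))
        schedulable.add(node)      # uncordon: node accepts pods again
    return lowest


if __name__ == "__main__":
    names = ["node1", "node2", "node3"]
    print(rolling_node_plan(names))
    print(min_schedulable_nodes(names))
```

With three nodes, at least two stay schedulable throughout, which is why the workloads keep running during the upgrade.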
You ran kubectl drain node1 but the pods were not evicted. What is the likely cause?
Solution
Step 1: Understand drain behavior with DaemonSets
By default, kubectl drain refuses to proceed when DaemonSet-managed pods are running on the node, unless --ignore-daemonsets is passed.
Step 2: Check other options for correctness
A node being uncordoned does not block eviction; control plane nodes can be drained; pods without local storage do not block drain.
Final Answer:
DaemonSet pods are blocking eviction -> Option A
Quick Check:
DaemonSet pods block drain without the flag [OK]
- Not using --ignore-daemonsets flag
- Confusing cordon with drain
- Assuming control plane nodes cannot be drained
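The DaemonSet rule from this solution can be captured in a toy model. It covers only that one rule; the real kubectl drain also refuses in other cases, such as pods using emptyDir volumes or pods with no controller, unless further flags are given. The pod names and the mapping of pod name to owner kind are hypothetical inputs.

```python
def drain_blockers(pod_owner_kinds: dict[str, str], ignore_daemonsets: bool) -> list[str]:
    """Names of pods that would stop the drain in this simplified model.

    pod_owner_kinds maps a pod name to the kind of its owning controller,
    for example "ReplicaSet" or "DaemonSet".
    """
    if ignore_daemonsets:
        return []
    return sorted(
        name for name, kind in pod_owner_kinds.items() if kind == "DaemonSet"
    )


if __name__ == "__main__":
    pods_on_node = {
        "web-1": "ReplicaSet",
        "log-agent-x": "DaemonSet",
        "kube-proxy-y": "DaemonSet",
    }
    print(drain_blockers(pods_on_node, ignore_daemonsets=False))
    print(drain_blockers(pods_on_node, ignore_daemonsets=True))
```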
Solution
Step 1: Consider cloud provider tools for control plane upgrade
Cloud provider tools often automate safe control plane upgrades, reducing manual errors.
Step 2: Upgrade worker nodes one by one with drain/uncordon
This approach avoids downtime by keeping workloads running on other nodes during the upgrade.
Step 3: Evaluate other options for risks
Upgrading all nodes simultaneously or skipping drain risks downtime and pod failures.
Final Answer:
Use cloud provider upgrade tools to upgrade control plane, then drain and upgrade workers one by one -> Option D
Quick Check:
Cloud tools + sequential worker upgrade [OK]
- Upgrading all nodes simultaneously causing downtime
- Skipping drain causing pod disruption
- Ignoring cloud provider upgrade features
