Node affinity and anti-affinity in Kubernetes - Time & Space Complexity
We want to understand how the time to schedule pods changes when using node affinity and anti-affinity rules in Kubernetes.
Specifically, how does the scheduler's work grow as the number of nodes and pods increases?
Analyze the time complexity of this node affinity and anti-affinity snippet.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - frontend
        topologyKey: kubernetes.io/hostname
```
This pod requires nodes labeled with "disktype=ssd" and avoids nodes with pods labeled "app=frontend" on the same host.
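The two rules in the spec boil down to two per-node predicates. Here is a minimal sketch of them in Python; the function names and the dict-based representation of labels are assumptions for illustration, not the real kube-scheduler API (which is written in Go and uses Filter plugins).

```python
def matches_node_affinity(node_labels):
    """Node affinity check implied by the spec:
    require disktype In [ssd] on the node's labels."""
    return node_labels.get("disktype") in {"ssd"}

def violates_pod_anti_affinity(pods_on_node):
    """Pod anti-affinity check implied by the spec: with
    topologyKey kubernetes.io/hostname, reject the node if any pod
    already running on it carries app In [frontend]."""
    return any(pod.get("app") in {"frontend"} for pod in pods_on_node)
```

A node is feasible only if the first predicate is true and the second is false, which is why both the node's labels and every pod on the node must be inspected.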
Identify the repeated work: the loops and traversals the scheduler performs for each scheduling decision.
- Primary operation: for each candidate node, the scheduler compares the node's labels against the nodeAffinity rule and the labels of every pod running on that node against the podAntiAffinity rule.
- How many times: once per node for the label check, plus once per pod on each node for the anti-affinity check.
As the number of nodes (n) and pods per node (m) grow, the scheduler must check more labels and pods.
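That nested structure (outer loop over nodes, inner scan over each node's pods) can be sketched as a single self-contained function. The `nodes` list-of-dicts shape is a hypothetical stand-in for the scheduler's internal node snapshot, used here only to make the loop structure visible.

```python
def feasible_nodes(nodes):
    """Return the names of nodes satisfying both rules from the spec.

    `nodes` is a hypothetical list of dicts, e.g.:
      {"name": "node-1", "labels": {"disktype": "ssd"},
       "pods": [{"app": "backend"}, ...]}
    """
    result = []
    for node in nodes:                            # n iterations
        if node["labels"].get("disktype") != "ssd":
            continue                              # node affinity failed
        blocked = any(pod.get("app") == "frontend"
                      for pod in node["pods"])    # up to m iterations
        if not blocked:
            result.append(node["name"])
    return result
```

The outer loop runs n times and the inner scan up to m times per node, which is exactly where the n × m growth below comes from.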
| Input Size (n nodes, m pods/node) | Approx. Operations |
|---|---|
| 10 nodes, 5 pods/node | ~50 checks |
| 100 nodes, 10 pods/node | ~1000 checks |
| 1000 nodes, 20 pods/node | ~20,000 checks |
Pattern observation: The checks grow roughly with the number of nodes times the pods per node.
Time Complexity: O(n * m)
This means the scheduler's work grows linearly with the total number of pods it must inspect across all candidate nodes (n nodes × m pods each); doubling either the node count or the pod density roughly doubles the checks.
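The table's growth pattern can be reproduced with a simple counter. This is a simplified model that counts one label check per (node, pod) pair, matching how the table above tallies operations; it is not a measurement of the real scheduler.

```python
def count_checks(n_nodes, pods_per_node):
    """Count pod-label checks in the simplified model:
    every node is visited, and every pod on it is scanned once."""
    checks = 0
    for _ in range(n_nodes):            # one pass over every node
        for _ in range(pods_per_node):  # scan each pod's labels
            checks += 1
    return checks

# 10 nodes, 5 pods/node  -> 50 checks
# 1000 nodes, 20 pods/node -> 20,000 checks
```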
[X] Wrong: "The scheduler only checks node labels once, so time stays the same no matter how many pods or nodes there are."
[OK] Correct: The scheduler must check each node and the pods on it to apply affinity and anti-affinity rules, so more nodes and pods mean more checks.
Understanding how scheduling time grows helps you explain how Kubernetes handles workloads efficiently and what might slow it down as clusters grow.
What if we changed requiredDuringSchedulingIgnoredDuringExecution to preferredDuringSchedulingIgnoredDuringExecution? How would the time complexity change?