Cost optimization in Kubernetes - Time & Space Complexity
When managing Kubernetes clusters, it's important to understand how resource usage grows as you add more workloads.
We want to know how the cost of running and scaling workloads changes as the number of pods or nodes increases.
Analyze the time complexity of this Kubernetes Horizontal Pod Autoscaler (HPA) configuration.

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: example-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: example-deployment
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50

This HPA automatically adjusts the number of pods based on CPU usage, between 1 and 10 pods.
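To make the scaling rule concrete, here is a minimal Python sketch of the replica calculation that the Kubernetes documentation describes for the HPA: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), clamped to minReplicas and maxReplicas. The function name and parameters are illustrative and not part of any Kubernetes API. The 0.1 tolerance follows the documented default; stabilization windows and missing metrics are ignored.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Hypothetical helper mirroring the documented HPA formula."""
    ratio = current_utilization / target_utilization
    # The controller skips scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds from the HPA spec (1 and 10 in the manifest above).
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 pods averaging 80% CPU against the 50% target give ceil(4 × 1.6) = 7 pods, while 8 pods at 100% would ask for 16 and be capped at maxReplicas, 10.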
Look at what repeats as the cluster scales.
- Primary operation: Monitoring CPU usage of each pod and adjusting pod count.
- How many times: Once per pod in every evaluation cycle; the HPA controller repeats the cycle on a fixed interval (15 seconds by default).
As the number of pods increases, the system checks CPU usage for each pod to decide scaling.
| Input Size (pods) | Approx. Operations (CPU checks) |
|---|---|
| 10 | 10 checks |
| 100 | 100 checks |
| 1000 | 1000 checks |
Pattern observation: The number of CPU usage checks grows directly with the number of pods.
Time Complexity: O(n)
This means the cost to monitor and scale grows linearly as you add more pods.
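The linear growth can be checked with a small simulation. This sketch assumes one CPU sample per pod per evaluation cycle; the function and the sample values are invented for illustration and do not call any Kubernetes API.

```python
def average_utilization(pod_utilizations):
    """Return (average, samples_read) for one evaluation cycle.

    Assumes at least one pod; an empty list would divide by zero.
    """
    total = 0.0
    samples_read = 0
    for utilization in pod_utilizations:  # one read per pod -> O(n)
        total += utilization
        samples_read += 1
    return total / samples_read, samples_read


for pod_count in (10, 100, 1000):
    _, reads = average_utilization([50.0] * pod_count)
    print(pod_count, "pods ->", reads, "CPU checks")
```

The loop body runs once per pod, so the number of reads matches the pod count in every row of the table.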
[X] Wrong: "Scaling cost stays the same no matter how many pods run."
[OK] Correct: Each pod adds monitoring overhead, so more pods mean more work to check and adjust resources.
Understanding how resource checks grow with workload size helps you design efficient scaling strategies in Kubernetes.
What if the HPA monitored multiple metrics (CPU and memory) instead of just CPU? How would the time complexity change?
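As a starting point for this question: the Kubernetes documentation states that with several metrics the HPA computes a desired replica count for each metric and uses the largest one. The sketch below assumes that rule; the function name, the dictionary layout, and the sample numbers are hypothetical.

```python
import math


def desired_replicas_multi(current_replicas, metrics):
    """metrics maps a metric name to (current_value, target_value)."""
    proposals = {}
    for name, (current, target) in metrics.items():  # one pass per metric
        proposals[name] = math.ceil(current_replicas * current / target)
    # The largest proposal wins, so the busiest resource drives scaling.
    return max(proposals.values()), proposals


replicas, per_metric = desired_replicas_multi(
    4, {"cpu": (80, 50), "memory": (60, 80)})
print(replicas, per_metric)
```

With n pods and m metrics, one cycle reads about n × m samples, which is still linear in n for a fixed number of metrics.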
Practice
What is the purpose of setting requests and limits on Kubernetes pods for cost optimization?
Solution
Step 1: Understand resource requests and limits
Requests define the minimum resources reserved for a pod; limits cap its maximum usage.
Step 2: Link resource control to cost optimization
With both set, Kubernetes can schedule pods efficiently and avoid resource waste.
Final Answer:
To control how much CPU and memory a pod can use, preventing waste -> Option B
Quick Check:
Resource limits prevent waste [OK]
Common mistakes:
- Thinking limits increase pod count
- Confusing requests with autoscaling
- Assuming unlimited resources save money
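One way to see the waste that over-sized requests cause is to compare what each pod reserves with what it actually uses. The sketch below is illustrative only: the field names and usage figures are made up, and in practice usage would come from a metrics source.

```python
def cpu_slack_millicores(pods):
    """Sum reserved-but-unused CPU (in millicores) across pods."""
    slack = 0
    for pod in pods:
        # A pod using more than its request has no slack to count.
        slack += max(pod["request_m"] - pod["usage_m"], 0)
    return slack


sample_pods = [  # hypothetical values
    {"name": "api", "request_m": 500, "usage_m": 120},
    {"name": "worker", "request_m": 250, "usage_m": 300},
]
print(cpu_slack_millicores(sample_pods), "millicores reserved but idle")
```

Here the first pod reserves 380 millicores it never uses, which is capacity the cluster pays for but cannot hand to other workloads.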
Solution
Step 1: Check correct YAML structure for resources
Requests and limits must be nested under resources, with proper indentation and units.
Step 2: Validate units and order
A CPU request of '500m' means 0.5 CPU, and '256Mi' is a valid unit for the memory limit. The fragment that matches is:

    resources:
      requests:
        cpu: '500m'
      limits:
        memory: '256Mi'

Final Answer:
The resources fragment shown in Step 2 -> Option A
Quick Check:
Correct YAML with proper units [OK]
Common mistakes:
- Swapping requests and limits
- Using wrong units like 'MB' instead of 'Mi'
- Omitting quotes around values
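The unit rules can be illustrated with a small parser. This is a sketch that handles only a subset of Kubernetes quantity syntax (integer values with the suffixes listed below); the function names are hypothetical and it is not the parser Kubernetes itself uses.

```python
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_cpu(quantity):
    """Convert a CPU quantity such as '500m' or '2' to cores."""
    if quantity.endswith("m"):  # millicores
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as '256Mi' to bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    for suffix, factor in DECIMAL_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)  # plain bytes; raises ValueError for e.g. '256MB'
```

With this sketch, parse_cpu('500m') gives 0.5 cores and parse_memory('256Mi') gives 268435456 bytes, while '256MB' is rejected because 'MB' is not a recognized suffix.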
Consider this HPA configuration:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: web-app-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: web-app
      minReplicas: 2
      maxReplicas: 5
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50

What happens when CPU usage exceeds 50%?
Solution
Step 1: Understand HPA behavior with CPU utilization
The HPA increases the pod count when average CPU usage, measured against the pods' CPU requests, exceeds the target utilization (50%).
Step 2: Check min and max replicas
Pods scale between 2 and 5 replicas based on load; exceeding 50% triggers scaling up.
Final Answer:
The number of pods increases up to 5 to handle load -> Option C
Quick Check:
CPU > 50% triggers scale up [OK]
Common mistakes:
- Thinking pods scale down on high CPU
- Assuming pods restart on high CPU
- Believing CPU limits auto-increase
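The "up to 5" part of the answer can be worked through with integer arithmetic. This sketch assumes a fixed total CPU load spread evenly across pods and a per-pod CPU request of 500m; the web-app-hpa manifest does not state a request, so that value and the function name are hypothetical. Tolerance and stabilization behavior are ignored.

```python
def replicas_for_load(total_cpu_m, request_m, target_pct,
                      min_replicas, max_replicas):
    """Replica count at which average utilization is at or below the target.

    total_cpu_m: CPU the whole workload consumes, in millicores.
    request_m:   CPU request per pod, in millicores (assumed value).
    """
    per_pod_budget = request_m * target_pct           # millicores * percent
    needed = -(-total_cpu_m * 100 // per_pod_budget)  # integer ceiling division
    return max(min_replicas, min(max_replicas, needed))


for load in (400, 900, 1600):
    print(load, "m ->", replicas_for_load(load, 500, 50, 2, 5), "replicas")
```

A load of 1600m would need 7 pods to get back to 50%, but maxReplicas caps the result at 5, so average utilization stays above the target.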
Solution
Step 1: Analyze autoscaling parameters
A high minReplicas prevents scaling below that number, so idle pods keep running during low traffic and cause overspending.
Step 2: Evaluate other options
Low limits or readiness probes don't directly prevent scaling down; CPU requests greater than limits are invalid.
Final Answer:
The Horizontal Pod Autoscaler has a high minReplicas value -> Option D
Quick Check:
High minReplicas blocks scale down [OK]
Common mistakes:
- Confusing limits with requests
- Ignoring minReplicas effect
- Assuming readinessProbe affects scaling
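The cost effect of a high minReplicas can be estimated as a floor that no amount of idle time removes. The numbers below are hypothetical (5 cents per pod-hour, 730 hours in a month) and the function is an illustration, not a pricing model for any provider.

```python
def monthly_floor_cost_cents(min_replicas, pod_cost_cents_per_hour, hours=730):
    """Cost of the replicas that can never be scaled away."""
    return min_replicas * pod_cost_cents_per_hour * hours


high = monthly_floor_cost_cents(10, 5)  # hypothetical: 5 cents per pod-hour
low = monthly_floor_cost_cents(2, 5)
print(f"minReplicas=10: ${high / 100:.2f}, minReplicas=2: ${low / 100:.2f}")
```

Under these assumptions, lowering minReplicas from 10 to 2 cuts the unavoidable monthly spend from $365.00 to $73.00, provided traffic really does drop low enough for 2 pods to cope.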
Solution
Step 1: Understand cluster autoscaling
Cluster Autoscaler adjusts node count based on pod scheduling needs and resource requests.
Step 2: Importance of pod resource requests and limits
Accurate requests tell the autoscaler how much capacity pods actually need, and limits keep individual pods from exceeding their share, so nodes are scaled efficiently.
Step 3: Evaluate other options
Manual scaling wastes resources; disabling HPA or setting limits to zero causes inefficiency or errors.
Final Answer:
Cluster Autoscaler with properly set pod resource requests and limits -> Option A
Quick Check:
Autoscaler + resource requests = cost savings [OK]
Common mistakes:
- Relying on manual scaling only
- Disabling autoscaling features
- Setting resource limits to zero
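To show why requests drive node count, the sketch below estimates a lower bound on the nodes needed from summed pod requests. It is a simplification with hypothetical names and sizes: the real Cluster Autoscaler simulates scheduling, and bin-packing gaps, system reservations, and DaemonSets would push the number higher.

```python
def min_nodes_needed(pod_requests, node_cpu_m, node_mem_mi):
    """Lower bound on node count from total requested CPU and memory.

    pod_requests: list of (cpu_millicores, memory_mebibytes) per pod.
    """
    total_cpu = sum(cpu for cpu, _ in pod_requests)
    total_mem = sum(mem for _, mem in pod_requests)
    nodes_for_cpu = -(-total_cpu // node_cpu_m)   # integer ceiling division
    nodes_for_mem = -(-total_mem // node_mem_mi)
    # Whichever resource runs out first decides the node count.
    return max(nodes_for_cpu, nodes_for_mem)


right_sized = [(500, 256)] * 20    # hypothetical: 20 pods at 500m / 256Mi
over_sized = [(1000, 256)] * 20    # same pods with doubled CPU requests
print(min_nodes_needed(right_sized, node_cpu_m=4000, node_mem_mi=16384))
print(min_nodes_needed(over_sized, node_cpu_m=4000, node_mem_mi=16384))
```

Doubling the CPU request raises the estimate from 3 nodes to 5 even though the pods do the same work, which is how inflated requests turn directly into node cost.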
