
Cost optimization in Kubernetes - Commands & Configuration

Introduction
Running applications on Kubernetes can use more resources than needed, which costs money. Cost optimization helps you use just the right amount of resources so you save money without hurting your apps.
Use it in situations like these:
  • Your Kubernetes cluster bills are higher than expected and you want to reduce costs.
  • You want to avoid paying for unused CPU or memory in your app containers.
  • You want to automatically adjust resources based on app demand to save money.
  • You want to find and remove idle or unnecessary workloads in your cluster.
  • You want to set limits so no app uses more resources than it should.
Commands
Check current CPU and memory usage of all nodes to see if any are underused.
Terminal
kubectl top nodes
Expected Output
NAME            CPU(cores)   MEMORY(bytes)
worker-node-1   250m         512Mi
worker-node-2   100m         256Mi
worker-node-3   50m          128Mi
See resource usage of all pods to find which ones use too much or too little CPU and memory.
Terminal
kubectl top pods --all-namespaces
Expected Output
NAMESPACE     NAME                       CPU(cores)   MEMORY(bytes)
default       my-app-1                   100m         200Mi
default       my-app-2                   50m          150Mi
kube-system   coredns-558bd4d5db-7x9zq   10m          50Mi
--all-namespaces - Shows pods from all namespaces, not just the current one
Save the pod configuration to a file so you can edit resource requests and limits.
Terminal
kubectl get pod my-app-1 -o yaml > my-app-1.yaml
Expected Output
No output (command runs silently)
-o yaml - Outputs the pod details in YAML format
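Apply the updated configuration with optimized resource requests and limits to save costs. On many cluster versions the resource fields of a running pod cannot be changed in place, so for pods managed by a Deployment, make the same edit in the Deployment manifest and apply that instead.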
Terminal
kubectl apply -f my-app-1.yaml
Expected Output
pod/my-app-1 configured
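As a rough illustration of what the edited file might contain, here is a minimal sketch of a Deployment with explicit requests and limits. The Deployment name, labels, image, and all numbers are placeholders chosen for illustration and are not taken from a real workload; pick values based on the usage you observe with kubectl top.

```yaml
# Hypothetical Deployment; name, image, and numbers are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: nginx:1.25        # placeholder image
          resources:
            requests:              # amount the scheduler reserves on a node
              cpu: 150m
              memory: 256Mi
            limits:                # ceiling the container may not exceed
              cpu: 300m
              memory: 384Mi
```

Setting requests slightly above typical usage keeps the reserved (and billed) capacity close to what the app actually needs.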
Check if Horizontal Pod Autoscaler is set up to automatically adjust pod count based on load.
Terminal
kubectl get hpa
Expected Output
NAME         REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-app-hpa   Deployment/my-app   50%/80%   1         5         2          10m
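For reference, an HPA like the one in this sample output could be defined with a manifest along the following lines. This is a sketch that reuses the names and numbers from the output (my-app-hpa, Deployment/my-app, an 80% CPU target, 1 to 5 replicas). It assumes a metrics source such as metrics-server is installed and that the target pods declare CPU requests, since utilization is measured against them.

```yaml
# Sketch of an HPA matching the sample output above.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1                   # MINPODS column
  maxReplicas: 5                   # MAXPODS column
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # the "80%" in TARGETS
```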
Key Concept

If you remember nothing else from this pattern, remember: monitor your resource usage and set proper requests and limits to avoid paying for unused capacity.

Common Mistakes
  • Not setting resource requests and limits in pod specs.
    Why it hurts: Without limits, pods can use more resources than expected, causing higher costs and unstable clusters.
    Fix: Always define CPU and memory requests and limits in your pod or Deployment YAML files.
  • Ignoring idle or low-usage nodes and pods.
    Why it hurts: Idle resources waste money because you pay for capacity that is not used.
    Fix: Regularly check resource usage with kubectl top and remove or scale down unused workloads.
  • Not using autoscaling features like Horizontal Pod Autoscaler.
    Why it hurts: Without autoscaling, you either over-provision (waste money) or under-provision (poor performance).
    Fix: Set up autoscaling to adjust pod counts based on real demand automatically.
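One way to guard against the first mistake is a namespace-level LimitRange, which fills in default requests and limits for containers that omit them. The sketch below is illustrative only: the object name, namespace, and values are assumptions, and whether a LimitRange suits your cluster depends on how your teams manage namespaces.

```yaml
# Hypothetical LimitRange; name, namespace, and values are placeholders.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-resources
  namespace: default
spec:
  limits:
    - type: Container
      defaultRequest:              # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
      default:                     # applied when a container sets no limits
        cpu: 500m
        memory: 256Mi
```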
Summary
Use 'kubectl top' commands to monitor node and pod resource usage.
Edit pod resource requests and limits in YAML files to control how much CPU and memory each pod can use.
Apply changes with 'kubectl apply' to update running workloads.
Use Horizontal Pod Autoscaler to automatically scale pods based on load and save costs.

Practice

1. What is the main purpose of setting resource requests and limits on Kubernetes pods for cost optimization?
easy
A. To disable autoscaling features in the cluster
B. To control how much CPU and memory a pod can use, preventing waste
C. To increase the number of pods running simultaneously
D. To allow pods to use unlimited resources

Solution

  1. Step 1: Understand resource requests and limits

    Requests define minimum resources a pod needs; limits set maximum usage.
  2. Step 2: Link resource control to cost optimization

    By setting these, Kubernetes schedules pods efficiently and avoids resource waste.
  3. Final Answer:

    To control how much CPU and memory a pod can use, preventing waste -> Option B
  4. Quick Check:

    Resource limits prevent waste = B [OK]
Hint: Requests and limits control pod resource use to save costs [OK]
Common Mistakes:
  • Thinking limits increase pod count
  • Confusing requests with autoscaling
  • Assuming unlimited resources save money
2. Which of the following is the correct YAML snippet to set a CPU request of 500m and a memory limit of 256Mi for a container in Kubernetes?
easy
A. resources:\n requests:\n cpu: '500m'\n limits:\n memory: '256Mi'
B. resources:\n limits:\n cpu: '500m'\n requests:\n memory: '256Mi'
C. resources:\n requests:\n cpu: 500\n memory: 256
D. resources:\n requests:\n cpu: '0.5'\n limits:\n memory: '256MB'

Solution

  1. Step 1: Check correct YAML structure for resources

    Requests and limits must be under resources, with proper indentation and units.
  2. Step 2: Validate units and order

    A CPU request of '500m' means 0.5 CPU, and '256Mi' (mebibytes) is a valid memory unit, whereas '256MB' is not. Option A places cpu under requests and memory under limits, which matches the question.
  3. Final Answer:

    resources:\n requests:\n cpu: '500m'\n limits:\n memory: '256Mi' -> Option A
  4. Quick Check:

    Correct YAML with proper units = A [OK]
Hint: Nest requests and limits under resources; use 'm' for CPU millicores and 'Mi' for memory [OK]
Common Mistakes:
  • Swapping requests and limits
  • Using wrong units like 'MB' instead of 'Mi'
  • Omitting units (a bare cpu: 500 means 500 cores, and a bare memory: 256 means 256 bytes)
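Written out with real indentation instead of escaped newlines, the correct answer looks like the fragment below. It sits inside a container entry of a pod spec; the comments are added here for explanation and are not part of the quiz option.

```yaml
resources:
  requests:
    cpu: '500m'      # 500 millicores, i.e. half a CPU core
  limits:
    memory: '256Mi'  # 256 mebibytes (256 * 1024 * 1024 bytes)
```

The order of the requests and limits keys does not matter to Kubernetes; only the nesting and the units do.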
3. Given this Horizontal Pod Autoscaler (HPA) YAML snippet:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50

What happens when CPU usage exceeds 50%?
medium
A. Pods restart automatically
B. The number of pods decreases to 2 to save cost
C. The number of pods increases up to 5 to handle load
D. CPU limits are increased automatically

Solution

  1. Step 1: Understand HPA behavior with CPU utilization

    HPA increases pod count when average CPU utilization, measured relative to the pods' CPU requests, exceeds the target (50%).
  2. Step 2: Check min and max replicas

    Pods scale between 2 and 5 replicas based on load; exceeding 50% triggers scaling up.
  3. Final Answer:

    The number of pods increases up to 5 to handle load -> Option C
  4. Quick Check:

    CPU > 50% triggers scale up = C [OK]
Hint: HPA scales pods up when CPU usage exceeds target [OK]
Common Mistakes:
  • Thinking pods scale down on high CPU
  • Assuming pods restart on high CPU
  • Believing CPU limits auto-increase
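The size of the scale-up can be estimated with the replica formula described in the Kubernetes HPA documentation. The worked numbers below are an assumed scenario (2 current replicas averaging 80% CPU utilization against the 50% target), not values from the question.

```latex
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
      \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
  = \left\lceil 2 \times \frac{80}{50} \right\rceil
  = \lceil 3.2 \rceil
  = 4
```

The result is then clamped to the range set by minReplicas and maxReplicas, so in this HPA it can never fall below 2 or rise above 5.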
4. You notice your Kubernetes cluster is overspending because pods are not scaling down after load decreases. Which is the most likely cause?
medium
A. CPU requests are set higher than limits
B. Resource limits are set too low
C. Pods have no readinessProbe configured
D. The Horizontal Pod Autoscaler has a high minReplicas value

Solution

  1. Step 1: Analyze autoscaling parameters

    A high minReplicas prevents scaling below that number, causing overspending.
  2. Step 2: Evaluate other options

    Low limits or readiness probes don't directly prevent scaling down; CPU requests > limits is invalid.
  3. Final Answer:

    The Horizontal Pod Autoscaler has a high minReplicas value -> Option D
  4. Quick Check:

    High minReplicas blocks scale down = D [OK]
Hint: Check minReplicas to allow scaling down [OK]
Common Mistakes:
  • Confusing limits with requests
  • Ignoring minReplicas effect
  • Assuming readinessProbe affects scaling
5. You want to optimize costs by automatically scaling your Kubernetes cluster nodes based on pod resource usage. Which combination of tools and settings should you use?
hard
A. Cluster Autoscaler with properly set pod resource requests and limits
B. Manual node scaling with no pod resource limits
C. Disable Horizontal Pod Autoscaler and increase node count permanently
D. Set pod resource limits to zero and rely on node autoscaling

Solution

  1. Step 1: Understand cluster autoscaling

    Cluster Autoscaler adjusts node count based on pod scheduling needs and resource requests.
  2. Step 2: Importance of pod resource requests and limits

    The Cluster Autoscaler bases its decisions on pod resource requests rather than live usage, so accurate requests let it add or remove nodes efficiently; limits cap what each pod can consume.
  3. Step 3: Evaluate other options

    Manual scaling wastes resources; disabling HPA or zero limits causes inefficiency or errors.
  4. Final Answer:

    Cluster Autoscaler with properly set pod resource requests and limits -> Option A
  5. Quick Check:

    Autoscaler + resource requests = cost savings [OK]
Hint: Use Cluster Autoscaler plus pod requests/limits for best cost control [OK]
Common Mistakes:
  • Relying on manual scaling only
  • Disabling autoscaling features
  • Setting resource limits to zero