Horizontal Pod Autoscaler in Kubernetes - Commands & Configuration

Introduction

Your app needs more capacity when many users visit and less when few do. The Horizontal Pod Autoscaler (HPA) automatically adds or removes copies of your app (pods) so it can handle these changes smoothly.

It is a good fit in situations like these:

When your app gets more traffic during the day and less at night, and you want it to adjust automatically.

When you want to save money by not running too many app copies when they are not needed.

When you want to keep your app fast and responsive even if many people use it at once.

When you want to avoid manually changing the number of app copies every time traffic changes.

Config File - hpa.yaml

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

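This file tells Kubernetes to watch the CPU use of the pods in the deployment named example-deployment and to keep between 2 and 5 pods running. The 50% target is the average CPU utilization across those pods, measured as a percentage of each pod's CPU request. If the average rises above 50%, the autoscaler adds pods to share the work; if it falls below, it removes pods again, but never goes under 2.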
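How a CPU reading becomes a pod count can be sketched with the replica formula given in the Kubernetes documentation. The starting point below (2 running pods averaging 90% of their CPU request) is assumed purely for illustration; only the 50% target and the 2 to 5 range come from hpa.yaml.

```latex
\[
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
      \frac{\text{currentUtilization}}{\text{targetUtilization}} \right\rceil
  = \left\lceil 2 \times \frac{90}{50} \right\rceil
  = \lceil 3.6 \rceil
  = 4
\]
```

The result is then kept within minReplicas and maxReplicas, so with this file the count never drops below 2 or rises above 5. The controller also ignores small deviations from the target (by default roughly 10%), so a reading of 51% would not trigger a scale-up on its own.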

Commands

This command creates the Horizontal Pod Autoscaler in your Kubernetes cluster using the configuration file.

Terminal

kubectl apply -f hpa.yaml

Expected Output

horizontalpodautoscaler.autoscaling/example-hpa created

This command shows the current status of the autoscaler, including the current number of replicas and the current CPU utilization compared with the target.

Terminal

kubectl get hpa example-hpa

Expected Output

NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/example-deployment   30%/50%   2         5         2          1m

This command gives detailed information about the autoscaler, including events and scaling decisions.

Terminal

kubectl describe hpa example-hpa

Expected Output

Name:                                                  example-hpa
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Thu, 01 Jun 2023 12:00:00 +0000
Reference:                                             Deployment/example-deployment
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  30% / 50%
Min replicas:                                          2
Max replicas:                                          5
Replicas:                                              2
Conditions:
  Type         Status  Reason            Message
  ----         ------  ------            -------
  AbleToScale  True    SucceededRescale  the HPA controller was able to update the target scale to 2
Events:
  Type    Reason             Age  From                       Message
  ----    ------             ---  ----                       -------
  Normal  SuccessfulRescale  1m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization below target

Key Concept

If you remember nothing else from this pattern, remember: Horizontal Pod Autoscaler automatically adjusts the number of app copies, within the minimum and maximum you set, based on measured CPU use to keep your app fast and efficient.

Common Mistakes

Setting minReplicas higher than maxReplicas

Kubernetes rejects the HPA with a validation error if the minimum number of pods is greater than the maximum, so no autoscaler is created.

Always set minReplicas less than or equal to maxReplicas.

Not defining resource requests in the deployment pods

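The autoscaler calculates utilization as a percentage of each pod's CPU request. Without requests it cannot compute that percentage, the TARGETS column shows <unknown>, and no scaling happens.

Make sure every container in the deployment's pod template sets a CPU request under resources.requests.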
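A minimal sketch of what that looks like in the target deployment. The name example-deployment comes from hpa.yaml; the labels, container name, image, and the 200m request are placeholders assumed for illustration, not values from the original setup.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment      # must match scaleTargetRef.name in hpa.yaml
spec:
  selector:
    matchLabels:
      app: example              # hypothetical label
  template:
    metadata:
      labels:
        app: example            # must match the selector above
    spec:
      containers:
      - name: app               # hypothetical container name
        image: nginx:1.25       # placeholder image, assumed for illustration
        resources:
          requests:
            cpu: 200m           # assumed value; the 50% target is measured against this
```

With a 200m request, the 50% target corresponds to an average of about 100m of CPU per pod.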

Applying HPA before the target deployment exists

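The autoscaler needs the deployment to exist in order to watch it. If the deployment is missing, the HPA object is still created, but it cannot scale anything and reports errors in its conditions and events until the deployment appears.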

Create the deployment first, then apply the HPA.

Summary

Create a Horizontal Pod Autoscaler with a YAML file specifying the target deployment, min and max pods, and CPU usage target.

Apply the HPA configuration using kubectl apply -f to enable automatic scaling.

Check the autoscaler's status and details with kubectl get hpa and kubectl describe hpa commands.