
Horizontal Pod Autoscaler in Kubernetes - Commands & Configuration

Introduction
Your app needs more capacity when many users visit and less when few do. The Horizontal Pod Autoscaler (HPA) automatically adds or removes copies of your app (pods) to match the load.
Use it in situations like these:
When your app gets more traffic during the day and less at night, and you want it to adjust automatically.
When you want to save money by not running more app copies than you need.
When you want to keep your app fast and responsive even if many people use it at once.
When you want to avoid manually changing the number of app copies every time traffic changes.
Config File - hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

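This file tells Kubernetes to watch the CPU use of pods in the deployment named example-deployment. It keeps at least 2 pods and at most 5 pods. The 50% target is the average CPU utilization across the pods, measured as a percentage of each pod's CPU request. If the average goes above 50%, the autoscaler adds pods to share the work; if it falls well below 50%, the autoscaler removes pods, but never goes below 2.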
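To make the scaling rule concrete, here is a minimal Python sketch of the calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas x current utilization / target utilization), clamped to minReplicas and maxReplicas. The function name desired_replicas is made up for illustration. The 0.1 tolerance is the documented default, and the sketch ignores details the real controller handles, such as pods that are not ready, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified sketch of the HPA replica calculation (illustrative only)."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go outside the minReplicas..maxReplicas range.
    return max(min_replicas, min(max_replicas, desired))


# Values from hpa.yaml: target 50%, minReplicas 2, maxReplicas 5.
print(desired_replicas(2, 30, 50, 2, 5))   # light load: stays at the minimum
print(desired_replicas(2, 100, 50, 2, 5))  # CPU at double the target: scales up
print(desired_replicas(4, 100, 50, 2, 5))  # would need 8, capped at maxReplicas
```

With 2 pods at 30% CPU the result stays at 2. At 100% CPU the same 2 pods become 4, and 4 pods at 100% are capped at 5 by maxReplicas.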

Commands
This command creates the Horizontal Pod Autoscaler in your Kubernetes cluster using the configuration file.
Terminal
kubectl apply -f hpa.yaml
Expected Output
horizontalpodautoscaler.autoscaling/example-hpa created
This command shows the current status of the autoscaler, including the current CPU utilization against the target, the minimum and maximum pod counts, and the current number of replicas.
Terminal
kubectl get hpa example-hpa
Expected Output
NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/example-deployment   30%/50%   2         5         2          1m
This command gives detailed information about the autoscaler, including events and scaling decisions.
Terminal
kubectl describe hpa example-hpa
Expected Output
Name:                                                  example-hpa
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Thu, 01 Jun 2023 12:00:00 +0000
Reference:                                             Deployment/example-deployment
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  30% / 50%
Min replicas:                                          2
Max replicas:                                          5
Replicas:                                              2
Conditions:
  Type         Status  Reason            Message
  ----         ------  ------            -------
  AbleToScale  True    SucceededRescale  the HPA controller was able to update the target scale to 2
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  1m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization below target
Key Concept

If you remember nothing else from this pattern, remember: Horizontal Pod Autoscaler automatically adjusts the number of app copies based on real-time CPU use to keep your app fast and efficient.

Common Mistakes
Setting minReplicas higher than maxReplicas
Kubernetes rejects the HPA when you apply it, because maxReplicas must be greater than or equal to minReplicas.
Always set minReplicas less than or equal to maxReplicas.
Not defining resource requests in the deployment pods
The autoscaler measures CPU utilization as a percentage of each pod's CPU request. Without requests it cannot compute that percentage, so kubectl get hpa shows <unknown> under TARGETS and no scaling happens.
Make sure your deployment pods specify CPU requests under resources.
Applying HPA before the target deployment exists
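The autoscaler needs the deployment to exist to watch it. If the deployment is missing, the HPA object is still created, but it cannot scale anything, and kubectl describe hpa reports that it failed to get the target's scale.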
Create the deployment first, then apply the HPA.
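The three mistakes above can be caught before you run kubectl apply. The following Python sketch checks manifests that have already been loaded into plain dictionaries (for example with a YAML parser). The function name check_hpa_setup, the sample deployment, and its container name app are all hypothetical. This is an illustration of the checks, not a tool that ships with Kubernetes.

```python
def check_hpa_setup(hpa: dict, deployment: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were spotted."""
    problems = []
    spec = hpa["spec"]

    # Mistake 1: minReplicas higher than maxReplicas (minReplicas defaults to 1).
    if spec.get("minReplicas", 1) > spec["maxReplicas"]:
        problems.append("minReplicas is greater than maxReplicas")

    # Mistake 3: the HPA must point at a deployment that actually exists.
    target = spec["scaleTargetRef"]
    if (target["kind"], target["name"]) != (
        deployment["kind"], deployment["metadata"]["name"]
    ):
        problems.append("scaleTargetRef does not match the deployment")

    # Mistake 2: every container needs a CPU request for utilization targets.
    for container in deployment["spec"]["template"]["spec"]["containers"]:
        requests = container.get("resources", {}).get("requests", {})
        if "cpu" not in requests:
            problems.append(f"container {container['name']!r} has no CPU request")

    return problems


# The HPA from hpa.yaml, plus a hypothetical deployment with no CPU request.
hpa = {
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment",
                           "name": "example-deployment"},
        "minReplicas": 2,
        "maxReplicas": 5,
    }
}
deployment = {
    "kind": "Deployment",
    "metadata": {"name": "example-deployment"},
    "spec": {"template": {"spec": {"containers": [{"name": "app"}]}}},
}

print(check_hpa_setup(hpa, deployment))
```

Here the only problem reported is the missing CPU request on the app container. Adding resources.requests.cpu to that container makes the list come back empty.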
Summary
Create a Horizontal Pod Autoscaler with a YAML file specifying the target deployment, min and max pods, and CPU usage target.
Apply the HPA configuration using kubectl apply -f to enable automatic scaling.
Check the autoscaler's status and details with kubectl get hpa and kubectl describe hpa commands.