
How to Create a Horizontal Pod Autoscaler (HPA) in Kubernetes

To create a HorizontalPodAutoscaler (HPA) in Kubernetes, use the kubectl autoscale command or define an HPA YAML manifest specifying the target deployment, metrics, and scaling limits. The HPA automatically adjusts the number of pod replicas based on observed CPU utilization or custom metrics.

Syntax

The basic syntax to create an HPA using kubectl is:

  • kubectl autoscale deployment <deployment-name> --cpu-percent=<target-cpu-utilization> --min=<min-pods> --max=<max-pods>
  • Or define an HPA YAML manifest with the fields apiVersion, kind, metadata, and a spec that includes scaleTargetRef, minReplicas, maxReplicas, and metrics.

This tells Kubernetes which deployment to watch and how to scale pods based on CPU or other metrics.

bash
kubectl autoscale deployment <deployment-name> --cpu-percent=<target-cpu-utilization> --min=<min-pods> --max=<max-pods>
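
As a concrete illustration of the syntax above, the sketch below fills in the placeholders with example values. The helper name hpa_autoscale_cmd is hypothetical (it is not part of kubectl), and the values myapp, 50, 2, and 5 are assumptions chosen for illustration. The helper only prints the command, so it can be reviewed before it is run against a cluster.

```shell
# Hypothetical helper (not part of kubectl): prints the kubectl autoscale
# command for the given values instead of executing it.
# Arguments: <deployment-name> <target-cpu-utilization> <min-pods> <max-pods>
hpa_autoscale_cmd() {
  printf 'kubectl autoscale deployment %s --cpu-percent=%s --min=%s --max=%s\n' \
    "$1" "$2" "$3" "$4"
}

# Example values are assumptions chosen for illustration.
hpa_autoscale_cmd myapp 50 2 5
# prints: kubectl autoscale deployment myapp --cpu-percent=50 --min=2 --max=5
```

Copying the printed line into a terminal runs the actual command against the current cluster context.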

Example

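This example creates an HPA for a deployment named myapp. It keeps between 2 and 5 replicas running and targets an average CPU utilization of 50% across the pods, measured relative to each pod's CPU request. Save the manifest to a file and apply it with kubectl apply -f <file>.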

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Output
horizontalpodautoscaler.autoscaling/myapp-hpa created
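
To make the scaling behavior of this manifest more tangible, here is a rough sketch of the replica calculation. The Kubernetes documentation describes the core formula as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with the result kept within minReplicas and maxReplicas. The function name desired_replicas is hypothetical, and the sketch deliberately ignores the controller's tolerance, stabilization windows, and handling of pods that are not ready, so treat it as an approximation rather than the controller's exact logic.

```shell
# Sketch of the HPA replica calculation (simplified; see the assumptions above).
# Arguments: <current-replicas> <current-cpu-%> <target-cpu-%> <min> <max>
desired_replicas() {
  current=$1
  utilization=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  desired=$(( (current * utilization + target - 1) / target ))
  # Clamp the result to the [min, max] range from the manifest.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 3 80 50 2 5   # prints 5 (ceil(3 * 80 / 50) = ceil(4.8))
desired_replicas 3 10 50 2 5   # prints 2 (ceil(0.6) = 1, raised to minReplicas)
```

With the manifest above, 3 replicas averaging 80% CPU would scale out to 5, while very low usage would scale in no further than 2.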

Common Pitfalls

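  • Without metrics-server installed, the HPA cannot read CPU metrics, so it never scales and its target shows as <unknown>.
  • Setting minReplicas higher than maxReplicas makes the API server reject the manifest with a validation error.
  • An incorrect scaleTargetRef name or kind prevents the HPA from finding the resource it should scale.
  • For custom metrics, a missing or incomplete metric configuration means the HPA has nothing to act on, so no scaling happens.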
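
The pitfalls above can be caught early with a small pre-flight script. This is a minimal sketch: the function names check_replica_bounds and check_cluster are hypothetical, and the deployment name passed to check_cluster is whatever your HPA targets. check_cluster relies on kubectl top pods, which only succeeds when the metrics API is being served, and it is defined but not called here so the script stays safe to run without a cluster.

```shell
# Hypothetical pre-flight check: fails when minReplicas exceeds maxReplicas.
# Arguments: <min-replicas> <max-replicas>
check_replica_bounds() {
  if [ "$1" -gt "$2" ]; then
    echo "error: minReplicas ($1) is greater than maxReplicas ($2)" >&2
    return 1
  fi
  echo "ok: replicas may range from $1 to $2"
}

# Hypothetical cluster check; requires kubectl and access to a cluster.
# Argument: <deployment-name> (the scaleTargetRef name)
check_cluster() {
  # Fails when the metrics API (normally served by metrics-server) is unavailable.
  kubectl top pods >/dev/null || { echo "error: pod metrics unavailable" >&2; return 1; }
  # Fails when the target deployment does not exist in the current namespace.
  kubectl get deployment "$1" >/dev/null || { echo "error: deployment $1 not found" >&2; return 1; }
  echo "ok: metrics available and deployment $1 found"
}

check_replica_bounds 2 5
# prints: ok: replicas may range from 2 to 5
```

Against a live cluster, running check_cluster myapp before creating the HPA covers the metrics-server and scaleTargetRef pitfalls.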

Always verify your cluster has metrics-server running and check HPA status with kubectl get hpa.

bash
kubectl get hpa
NAME        REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp-hpa   Deployment/myapp   45%/50%   2         5         3          10m
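
When many HPAs exist, scanning this table by eye gets tedious. The sketch below filters kubectl get hpa output for autoscalers that have reached MAXPODS, which often means the ceiling is too low for the current load. The function name hpas_at_max is hypothetical, and the sketch assumes the default table layout with MAXPODS, REPLICAS, and AGE as the last three columns. It counts columns from the right because some kubectl versions print the TARGETS column with an embedded space.

```shell
# Hypothetical helper: reads `kubectl get hpa` table output on stdin and
# prints the name of every HPA whose REPLICAS value equals MAXPODS.
# Columns are counted from the end: AGE = $NF, REPLICAS = $(NF-1), MAXPODS = $(NF-2).
hpas_at_max() {
  awk 'NR > 1 && $(NF-1) == $(NF-2) { print $1 }'
}

# Sample input: the row from the output above plus one made-up row
# (other-hpa is illustrative only) that has hit its maximum.
printf '%s\n' \
  'NAME        REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE' \
  'myapp-hpa   Deployment/myapp   45%/50%   2         5         3          10m' \
  'other-hpa   Deployment/other   90%/50%   2         5         5          10m' \
  | hpas_at_max
# prints: other-hpa
```

On a live cluster the same filter can be applied with kubectl get hpa | hpas_at_max.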

Quick Reference

Field               Description
scaleTargetRef      The deployment or resource to autoscale
minReplicas         Minimum number of pod replicas
maxReplicas         Maximum number of pod replicas
metrics             Metrics to base scaling on (CPU, memory, custom)
kubectl autoscale   Command to quickly create an HPA from the CLI

Key Takeaways

  • Use kubectl autoscale or a YAML manifest to create an HPA that targets a deployment.
  • Ensure metrics-server is installed so the HPA can read CPU metrics.
  • Keep minReplicas less than or equal to maxReplicas to avoid configuration errors.
  • Check HPA status with kubectl get hpa to monitor scaling behavior.
  • Custom metrics require additional setup beyond basic CPU utilization.