How to Create a Horizontal Pod Autoscaler (HPA) in Kubernetes
To create a HorizontalPodAutoscaler (HPA) in Kubernetes, use the kubectl autoscale command or define an HPA YAML manifest specifying the target deployment, metrics, and scaling limits. The HPA automatically adjusts the number of pod replicas based on observed CPU utilization or custom metrics.
Syntax
The basic syntax to create an HPA using kubectl is:
- kubectl autoscale deployment <deployment-name> --cpu-percent=<target-cpu-utilization> --min=<min-pods> --max=<max-pods>
- Or define an HPA YAML manifest with fields like apiVersion, kind, metadata, and spec, including scaleTargetRef, minReplicas, maxReplicas, and metrics.
This tells Kubernetes which deployment to watch and how to scale pods based on CPU or other metrics.
bash
kubectl autoscale deployment <deployment-name> --cpu-percent=<target-cpu-utilization> --min=<min-pods> --max=<max-pods>
Example
This example creates an HPA for a deployment named myapp that scales pods between 2 and 5 replicas to keep CPU usage around 50%.
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
Output
horizontalpodautoscaler.autoscaling/myapp-hpa created
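The manifest above is normally saved to a file and created with kubectl apply -f. As a minimal sketch of how the same manifest could be parameterized from the shell, the helper below prints it for a given deployment name, replica bounds, and CPU target. The function name render_hpa and the <deployment>-hpa naming scheme are assumptions made for illustration, not part of kubectl. It also refuses bounds where the minimum exceeds the maximum, one of the mistakes listed under Common Pitfalls.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print an autoscaling/v2 HPA manifest for a Deployment.
# Usage: render_hpa <deployment> <min-pods> <max-pods> <target-cpu-percent>
render_hpa() {
  local name=$1 min=$2 max=$3 cpu=$4

  # Catch inverted bounds before the manifest ever reaches the cluster.
  if (( min > max )); then
    echo "error: min ($min) is greater than max ($max)" >&2
    return 1
  fi

  # The HPA name "<deployment>-hpa" is an assumed convention.
  cat <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ${name}-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${name}
  minReplicas: ${min}
  maxReplicas: ${max}
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: ${cpu}
EOF
}

# Print the manifest for review.
render_hpa myapp 2 5 50
```

To create the HPA from this output, pipe it to kubectl, which reads a manifest from standard input when given -f -: render_hpa myapp 2 5 50 | kubectl apply -f -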
Common Pitfalls
- Not having metrics-server installed causes HPA to fail because it cannot get CPU metrics.
- Setting minReplicas higher than maxReplicas causes errors.
- Using an incorrect scaleTargetRef name or kind prevents the HPA from working.
- For custom metrics, missing or incorrect metric configuration leads to no scaling.
Always verify your cluster has metrics-server running and check HPA status with kubectl get hpa.
bash
kubectl get hpa
Output
NAME        REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
myapp-hpa   Deployment/myapp   45%/50%   2         5         3          10m
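The TARGETS column shows current versus target utilization, where utilization is measured relative to the CPU the pods request. The Kubernetes documentation describes the scaling rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with the result kept between minReplicas and maxReplicas. The sketch below reproduces that arithmetic in the shell. It is a simplification: the function name desired_replicas is hypothetical, and the real controller also applies a tolerance band and stabilization behavior that are not modeled here.

```bash
#!/usr/bin/env bash
# Sketch of the documented HPA rule:
#   desired = ceil(currentReplicas * currentUtilization / targetUtilization)
# clamped to [minReplicas, maxReplicas].
# Simplification: tolerance and stabilization are ignored.
desired_replicas() {
  local current=$1 util=$2 target=$3 min=$4 max=$5

  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  local desired=$(( (current * util + target - 1) / target ))

  if (( desired < min )); then desired=$min; fi
  if (( desired > max )); then desired=$max; fi
  echo "$desired"
}

# Values from the output above: 3 replicas at 45% against a 50% target, bounds 2..5.
desired_replicas 3 45 50 2 5

# Hypothetical load spike: the same 3 replicas running at 90%.
desired_replicas 3 90 50 2 5
```

The first call prints 3, since ceil(3 × 45 / 50) = ceil(2.7) = 3, so the replica count stays where it is. The second prints 5: the raw result is ceil(5.4) = 6, which is capped at maxReplicas.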
Quick Reference
| Field | Description |
|---|---|
| scaleTargetRef | The deployment or resource to autoscale |
| minReplicas | Minimum number of pod replicas |
| maxReplicas | Maximum number of pod replicas |
| metrics | Metrics to base scaling on (CPU, memory, custom) |
| kubectl autoscale | Command to quickly create HPA from CLI |
Key Takeaways
Use kubectl autoscale or a YAML manifest to create an HPA targeting a deployment.
Ensure metrics-server is installed to provide CPU metrics for HPA.
Set minReplicas and maxReplicas correctly to avoid configuration errors.
Check HPA status with kubectl get hpa to monitor scaling behavior.
Custom metrics require additional setup beyond basic CPU utilization.