How to Scale Kubernetes Pods Based on CPU Usage
In Kubernetes, you can scale pods based on CPU usage with the HorizontalPodAutoscaler (HPA) resource. The HPA monitors CPU metrics and automatically adjusts the number of pod replicas to keep average CPU utilization near a target percentage of the pods' CPU requests.
Syntax
The HorizontalPodAutoscaler resource defines how to scale pods based on CPU usage. Key parts include:
- apiVersion: API group and version.
- kind: Always HorizontalPodAutoscaler.
- metadata.name: Name of the HPA object.
- spec.scaleTargetRef: The workload to scale, such as a Deployment.
- spec.minReplicas and spec.maxReplicas: Minimum and maximum pod counts.
- spec.metrics: Metrics to use for scaling, such as CPU usage.
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Example
This example creates an HPA for a deployment named example-deployment. It keeps CPU usage around 50% by scaling pods between 1 and 5 replicas automatically.
bash
kubectl apply -f - <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
kubectl get hpa example-hpa
Output
NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/example-deployment   45%/50%   1         5         2          1m
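To see why the HPA in this output stays at 2 replicas, it can help to work through the replica calculation described in the Kubernetes HPA documentation: desired = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. The shell sketch below is illustrative only. The function name desired_replicas is hypothetical, and the sketch ignores the controller's tolerance band and stabilization behavior.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MIN MAX
# Hypothetical helper: approximates the documented HPA formula with integer math.
desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5

  # Integer ceiling of (current * util / target).
  desired=$(( (current * util + target - 1) / target ))

  # Clamp to the configured replica bounds.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi

  echo "$desired"
}

desired_replicas 2 45 50 1 5    # 45% of a 50% target: prints 2 (no change)
desired_replicas 2 90 50 1 5    # 90% of a 50% target: prints 4
desired_replicas 2 200 50 1 5   # would need 8, capped at maxReplicas: prints 5
```

With the values from the output (2 replicas at 45% against a 50% target), the result is ceil(1.8) = 2, so the replica count does not change.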
Common Pitfalls
Common mistakes when scaling based on CPU usage include:
- Not setting CPU resource requests on pods. Utilization is calculated as a percentage of the request, so without one the HPA cannot compute it and will not scale on CPU.
- Using the autoscaling/v1 API, which only supports CPU and lacks newer metric types.
- Setting minReplicas and maxReplicas too close together, which limits scaling.
- Not having metrics-server installed, so CPU metrics are unavailable.
yaml
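### Wrong: No resource requests set
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example   # apps/v1 requires a selector; this label value is illustrative
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: app
        image: nginx
        # Missing resources.requests.cpu
---
### Right: Set CPU requests for accurate metrics
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example   # apps/v1 requires a selector; this label value is illustrative
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: app
        image: nginx
        resources:
          requests:
            cpu: 100m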
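The reason the request matters is that a Utilization target is a percentage of the requested CPU, not of the node's capacity. The sketch below illustrates that arithmetic under simplifying assumptions: values are whole millicores, and the function name cpu_utilization_pct is hypothetical rather than part of any Kubernetes tool. When no request is given, it prints <unknown>, mirroring what kubectl get hpa shows when utilization cannot be computed.

```shell
#!/bin/sh
# cpu_utilization_pct USAGE_MILLICORES [REQUEST_MILLICORES]
# Hypothetical helper: utilization = usage / request, as a whole percentage.
cpu_utilization_pct() {
  usage_m=$1
  request_m=${2:-0}   # treat a missing request as 0

  # Without a positive request there is no denominator, so no percentage.
  if [ "$request_m" -le 0 ]; then
    echo "<unknown>"
    return 0
  fi

  echo $(( usage_m * 100 / request_m ))
}

cpu_utilization_pct 45 100   # 45m used of a 100m request: prints 45
cpu_utilization_pct 45       # no request set: prints <unknown>
```

With the 100m request from the corrected Deployment, a pod using 45m of CPU reports 45% utilization, which the HPA then compares against its 50% target.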
Quick Reference
Tips for scaling based on CPU usage in Kubernetes:
- Always set CPU requests in pod specs for reliable metrics.
- Use the autoscaling/v2 API for more metric options.
- Install metrics-server in your cluster to provide CPU data.
- Set reasonable minReplicas and maxReplicas values.
- Check HPA status with kubectl get hpa.
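The checks in this list can be gathered into one small script. The sketch below is one possible way to do it, assuming kubectl is configured for the cluster and metrics-server exposes the standard v1beta1.metrics.k8s.io API. The function name hpa_preflight is hypothetical, and the block only defines the function without running it.

```shell
#!/bin/sh
# hpa_preflight DEPLOYMENT HPA_NAME
# Hypothetical helper; requires kubectl and access to a cluster.
hpa_preflight() {
  deploy=$1
  hpa=$2

  echo "== Metrics API registration (served by metrics-server) =="
  kubectl get apiservice v1beta1.metrics.k8s.io

  echo "== Live CPU usage per pod =="
  kubectl top pods

  echo "== CPU requests on ${deploy} containers (empty means none set) =="
  kubectl get deployment "$deploy" \
    -o jsonpath='{.spec.template.spec.containers[*].resources.requests.cpu}'
  echo

  echo "== HPA status, conditions, and events =="
  kubectl describe hpa "$hpa"
}
```

For the example in this article, it would be called as hpa_preflight example-deployment example-hpa. An error from the first two commands usually points to a missing or unhealthy metrics-server, and an empty third section points to missing CPU requests.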
Key Takeaways
Use HorizontalPodAutoscaler (HPA) to scale pods automatically based on CPU usage.
Set CPU resource requests in pod specs to enable accurate CPU metrics.
Install metrics-server in your cluster to provide CPU usage data for HPA.
Configure minReplicas and maxReplicas to control scaling limits.
Use autoscaling/v2 API version for flexible and modern scaling options.