How to Scale Based on CPU Usage in Kubernetes

How to Scale Kubernetes Pods Based on CPU Usage

In Kubernetes, you can scale pods based on CPU usage with the HorizontalPodAutoscaler (HPA) resource. The HPA controller reads CPU metrics for the target workload's pods and adjusts the replica count automatically to keep average CPU utilization, measured as a percentage of each pod's CPU request, near a target value.
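The replica count follows from the ratio of current to target utilization. The Kubernetes documentation gives the rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below reproduces that arithmetic in shell. The function name `desired_replicas` and its argument order are hypothetical, and the sketch ignores the controller's tolerance band (10% by default) and stabilization behavior, so treat it as an illustration rather than the controller's exact logic.

```shell
# Hypothetical helper (not part of kubectl).
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
desired_replicas() {
  local current=$1 util=$2 target=$3 min=$4 max=$5

  # Integer form of ceil(current * util / target).
  local desired=$(( (current * util + target - 1) / target ))

  # Clamp the result to the minReplicas..maxReplicas range.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi

  echo "$desired"
}

desired_replicas 2 90 50 1 5    # ceil(2 * 90 / 50) = ceil(3.6), prints 4
desired_replicas 3 200 50 1 5   # ceil(12) exceeds max, prints 5
```

With a 50% target, two replicas running at 90% are scaled to four, and the result never leaves the configured minimum and maximum.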

📐

Syntax

The HorizontalPodAutoscaler resource defines how to scale pods based on CPU usage. Key parts include:

apiVersion: API group and version.
kind: Always HorizontalPodAutoscaler.
metadata.name: Name of the HPA object.
spec.scaleTargetRef: The workload to scale, such as a Deployment or StatefulSet. An HPA cannot target an individual pod.
spec.minReplicas and spec.maxReplicas: Minimum and maximum pod counts.
spec.metrics: Metrics to use for scaling, such as CPU usage.

yaml

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

💻

Example

This example creates an HPA for a Deployment named example-deployment, which must already exist. It keeps average CPU utilization around 50% of the pods' CPU requests by scaling between 1 and 5 replicas automatically.

bash

kubectl apply -f - <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF

kubectl get hpa example-hpa

Output

NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/example-deployment   45%/50%   1         5         2          1m

⚠️

Common Pitfalls

Common mistakes when scaling based on CPU usage include:

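Not setting CPU requests on the pods' containers. Utilization is a percentage of the request, so without one the HPA cannot compute it and reports the target as <unknown>.
Using the autoscaling/v1 API, which only supports CPU utilization and lacks the newer metric types.
Setting minReplicas and maxReplicas too close together, which leaves little room to scale.
Not having metrics-server installed, so CPU metrics are unavailable.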
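Two of these pitfalls can be checked before creating the HPA. The sketch below assumes metrics-server is the metrics provider, in which case the resource metrics API is registered as the APIService `v1beta1.metrics.k8s.io`, and that `kubectl top pods` returns data once metrics are flowing. The function name `check_cpu_metrics` is hypothetical, and a registered APIService does not by itself guarantee that it is healthy.

```shell
# Hypothetical helper (not part of kubectl): verifies that CPU metrics
# are available before relying on an HPA.
# Usage: check_cpu_metrics [NAMESPACE]
check_cpu_metrics() {
  local namespace="${1:-default}"

  # metrics-server registers the resource metrics API under this name.
  if ! kubectl get apiservice v1beta1.metrics.k8s.io >/dev/null 2>&1; then
    echo "metrics API not registered: install metrics-server first"
    return 1
  fi

  # Succeeds only when per-pod CPU and memory usage can be read.
  if ! kubectl top pods --namespace "$namespace" >/dev/null 2>&1; then
    echo "metrics API registered but no pod metrics available yet"
    return 1
  fi

  echo "CPU metrics available in namespace $namespace"
}
```

Running `check_cpu_metrics default` against a cluster prints which of the two checks failed, if any. Pod metrics can take a minute or so to appear after metrics-server starts.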

yaml

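# Wrong: no CPU request, so the HPA cannot compute utilization
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 1
  # apps/v1 requires a selector that matches the pod template labels.
  # The app: example label is illustrative.
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: app
        image: nginx
        # Missing resources.requests.cpu
---
# Right: CPU request set, so utilization can be computed
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: app
        image: nginx
        resources:
          requests:
            cpu: 100m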
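To see why the request matters, it helps to work through how a utilization percentage is derived from it. The sketch below assumes every pod has the same CPU request, as in the Deployment above with its 100m request, and divides total usage by total requested CPU. The function name `average_utilization` and the usage figures are hypothetical.

```shell
# Hypothetical helper: average CPU utilization across pods, in percent.
# Usage: average_utilization REQUEST_MILLICORES USAGE_MILLICORES...
average_utilization() {
  local request=$1
  shift
  local total=0 count=0 usage

  for usage in "$@"; do
    total=$(( total + usage ))
    count=$(( count + 1 ))
  done

  # Without at least one pod and a non-zero request there is nothing to divide by.
  if [ "$count" -eq 0 ] || [ "$request" -eq 0 ]; then
    echo "unknown"
    return 1
  fi

  # Total usage as a percentage of total requested CPU.
  echo $(( total * 100 / (request * count) ))
}

average_utilization 100 80 60 40   # 180m used of 300m requested, prints 60
```

A request of zero, or no request at all, leaves the percentage undefined, which is the situation the first Deployment creates.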

📊

Quick Reference

Tips for scaling based on CPU usage in Kubernetes:

Always set CPU requests in pod specs so the HPA can compute utilization.
Use autoscaling/v2 API for more metric options.
Install metrics-server in your cluster to provide CPU data.
Set reasonable minReplicas and maxReplicas values.
Check HPA status with kubectl get hpa.
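Beyond `kubectl get hpa`, two other views are often useful when an HPA does not behave as expected. The sketch below wraps them in small functions. The function names are hypothetical, while `status.currentReplicas` and `status.desiredReplicas` are fields of the HPA status and `kubectl describe hpa` lists conditions and recent scaling events.

```shell
# Hypothetical helper: prints "<currentReplicas> <desiredReplicas>" for an HPA.
# Usage: hpa_replicas NAME [NAMESPACE]
hpa_replicas() {
  local name=$1 namespace="${2:-default}"
  kubectl get hpa "$name" --namespace "$namespace" \
    -o jsonpath='{.status.currentReplicas} {.status.desiredReplicas}'
}

# Hypothetical helper: shows conditions and recent scaling events for an HPA.
# Usage: hpa_details NAME [NAMESPACE]
hpa_details() {
  local name=$1 namespace="${2:-default}"
  kubectl describe hpa "$name" --namespace "$namespace"
}
```

For the HPA in this article, `hpa_replicas example-hpa` prints the two replica counts, and `hpa_details example-hpa` is the place to look for messages about missing metrics or missing CPU requests.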

✅

Key Takeaways

Use HorizontalPodAutoscaler (HPA) to scale pods automatically based on CPU usage.

Set CPU resource requests in pod specs so the HPA can compute CPU utilization.

Install metrics-server in your cluster to provide CPU usage data for HPA.

Configure minReplicas and maxReplicas to control scaling limits.

Use autoscaling/v2 API version for flexible and modern scaling options.