How to Scale Kubernetes Pods Based on Memory Usage
To scale pods based on memory usage in Kubernetes, use the HorizontalPodAutoscaler resource with memory as the target metric. This requires metrics-server to be installed so that memory usage data is available, and you define the memory utilization threshold in the autoscaler spec.
Syntax
The HorizontalPodAutoscaler (HPA) resource defines how Kubernetes scales pods automatically. Key parts include:
- apiVersion and kind: Define the resource type.
- metadata.name: Name of the HPA.
- spec.scaleTargetRef: The scalable workload to scale, such as a Deployment or StatefulSet.
- spec.minReplicas and spec.maxReplicas: Minimum and maximum pod counts.
- spec.metrics: Defines the metric type and target value, e.g., memory usage.
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
Example
This example shows an HPA that scales a deployment named web-app between 1 and 4 pods, targeting an average memory utilization of 60% of the pods' memory requests.
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-memory-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 60
Output
horizontalpodautoscaler.autoscaling/web-app-memory-hpa created
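As a rough illustration of how the 60% target turns into a replica count, the Kubernetes documentation describes the core HPA calculation as a ratio of the current metric value to the desired one. The pod counts and utilization figures below are made-up numbers for this web-app example, not measurements:

```latex
\[
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
% Assumed: 2 pods averaging 90% of their memory request, target 60%
\[
\left\lceil 2 \times \frac{90}{60} \right\rceil = \lceil 3 \rceil = 3
\]
% Assumed: 3 pods averaging 30% of their memory request, target 60%
\[
\left\lceil 3 \times \frac{30}{60} \right\rceil = \lceil 1.5 \rceil = 2
\]
```

The result is then clamped to the minReplicas and maxReplicas range, so this HPA never goes below 1 or above 4 pods.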
Common Pitfalls
- Metrics-server not installed: Memory metrics won't be available without it.
- Incorrect metric type: Using cpu instead of memory when you want memory-based scaling.
- Resource requests missing: Pods must have memory requests set for utilization metrics to work.
- Extreme thresholds: Setting the utilization target very low or very high can cause flapping or no scaling at all.
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wrong-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 10 # Too low: triggers scale-out almost constantly
---
# Corrected version:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: correct-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
Quick Reference
Remember these key points when scaling based on memory usage:
- Install metrics-server to provide memory metrics.
- Set memory as the resource metric in HPA.
- Ensure pods have memory requests defined.
- Choose a reasonable averageUtilization percentage (50-80%).
- Set sensible minReplicas and maxReplicas limits.
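The memory requests point above is easy to miss, so here is a minimal sketch of what the target Deployment might look like. The container name, image, labels, and memory sizes are assumptions chosen for illustration; only the name web-app comes from the earlier example.

```yaml
# Hypothetical Deployment for the web-app example; image and sizes are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: nginx:1.27        # placeholder image, not from the original example
        resources:
          requests:
            memory: "256Mi"      # utilization is measured against this request
          limits:
            memory: "512Mi"      # assumed limit, kept above the request
```

With a 256Mi request and a 60% target, the HPA would aim for roughly 154Mi of average memory use per pod. If a pod runs several containers, each one needs a memory request for the utilization figure to be computed.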
Key Takeaways
Use HorizontalPodAutoscaler with memory resource metrics to scale pods based on memory usage.
Ensure metrics-server is installed and pods have memory requests set for accurate scaling.
Set averageUtilization to a balanced value to avoid frequent scaling changes.
Define clear minReplicas and maxReplicas to control scaling boundaries.
Verify your HPA configuration with kubectl after applying to confirm it works.
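For the takeaway about avoiding frequent scaling changes, the autoscaling/v2 API also has an optional spec.behavior section that can slow down scale-in. The sketch below reuses the web-app example. The window length and policy values are illustrative assumptions, not recommendations from the original example.

```yaml
# Sketch only: the stabilization window and policy values are illustrative assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-memory-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 60
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # act on the highest recommendation from the last 5 minutes
      policies:
      - type: Pods
        value: 1                        # remove at most one pod...
        periodSeconds: 60               # ...per 60-second period
```

A longer window makes scale-in less reactive to short dips in memory use, at the cost of keeping extra pods running for longer.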