Horizontal Pod Autoscaler in Kubernetes - Step-by-Step Execution

Process Flow - Horizontal Pod Autoscaler

Monitor Metrics

↓

Compare Current Load with Target

↓

Decide to Scale Up or Down

↓

Increase or Decrease

↓

Change Pod Count

↓

Wait & Repeat

The Horizontal Pod Autoscaler watches metrics, compares them to their targets, and adjusts the pod count up or down, repeating the check at a fixed interval (every 15 seconds by default).
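The flow above can be sketched as a small simulation. This is a minimal illustration, not the controller's actual code: `simulate_hpa` and `total_demand` are hypothetical names, the load is assumed to be constant and spread evenly across pods, and the replica formula is the one given in the Kubernetes documentation, ceil(currentReplicas × currentMetric / targetMetric). The tolerance band and stabilization windows of the real controller are left out.

```python
def simulate_hpa(total_demand, replicas, target_cpu=50,
                 min_replicas=2, max_replicas=5, max_checks=10):
    """Return the replica count after each change until the loop settles.

    total_demand is a hypothetical fixed load, in percent of one pod's CPU
    request, assumed to be shared evenly by all pods.
    """
    history = [replicas]
    for _ in range(max_checks):
        # 1. Monitor: average CPU utilization per pod (integer percent, rounded down).
        current_cpu = total_demand // replicas
        # 2. Compare and decide: ceil(replicas * current / target) using integer math.
        desired = (replicas * current_cpu + target_cpu - 1) // target_cpu
        # Keep the result inside the configured min/max bounds.
        desired = max(min_replicas, min(max_replicas, desired))
        # 3. Change the pod count. The real loop never exits; the simulation
        #    stops once a check would leave the count unchanged.
        if desired == replicas:
            break
        replicas = desired
        history.append(replicas)
    return history


print(simulate_hpa(total_demand=200, replicas=2))
print(simulate_hpa(total_demand=60, replicas=5))
```

Under these assumptions, a demand of 200 starting from 2 pods settles at 4 pods (50% each), and a demand of 60 starting from 5 pods drops to the minimum of 2.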

Execution Sample

kubectl autoscale deployment myapp --min=2 --max=5 --cpu-percent=50

This command creates an autoscaler for the 'myapp' deployment that targets 50% average CPU utilization, measured against each pod's CPU request, and keeps the pod count between 2 and 5.
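The same autoscaler can also be described declaratively. The sketch below assumes the `autoscaling/v2` API and builds the object in Python, printing it as JSON. `build_hpa_manifest` is an illustrative helper, not part of any Kubernetes client library, and the object is named after the deployment, as `kubectl autoscale` does by default.

```python
import json


def build_hpa_manifest(name, min_replicas, max_replicas, cpu_percent):
    """Build an autoscaling/v2 HorizontalPodAutoscaler object for a Deployment."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count the autoscaler manages.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Target average CPU utilization, as a percentage of each pod's CPU request.
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_percent,
                        },
                    },
                }
            ],
        },
    }


manifest = build_hpa_manifest("myapp", 2, 5, 50)
print(json.dumps(manifest, indent=2))
```

Saved to a file, the printed JSON could be applied with `kubectl apply -f`, which should have the same effect as the `kubectl autoscale` command above.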

Process Table

Step	Current CPU %	Target CPU %	Pods Before	Decision	Pods After
1	30	50	2	CPU below target, no scale change	2
2	55	50	2	CPU above target, scale up	3
3	70	50	3	CPU well above target, scale up	4
4	45	50	4	CPU below target, scale down	3
5	20	50	3	CPU well below target, scale down	2
6	50	50	2	CPU at target, no change	2

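💡 The autoscaler monitors continuously; this trace shows its decisions over 6 checks. The trace is simplified to move one pod per check. The real controller computes desiredReplicas = ceil(currentReplicas × currentCPU / targetCPU), so it can add or remove several pods in one step, and by default it skips scaling when the metric is within roughly 10% of the target.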
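As a point of comparison with the trace, here is a minimal Python sketch that applies the replica formula from the Kubernetes documentation, ceil(currentReplicas × currentMetric / targetMetric), to each row of the table independently. The name `desired_replicas` is illustrative, the rows are copied from the table, and the sketch deliberately omits the tolerance band, stabilization windows, and pod-readiness handling of the real controller.

```python
def desired_replicas(current_replicas, current_cpu, target_cpu,
                     min_replicas, max_replicas):
    """ceil(current_replicas * current_cpu / target_cpu), clamped to [min, max]."""
    # Integer ceiling division avoids floating-point rounding surprises.
    raw = (current_replicas * current_cpu + target_cpu - 1) // target_cpu
    return max(min_replicas, min(max_replicas, raw))


# (step, current CPU %, pods before, pods after) as listed in the process table.
rows = [
    (1, 30, 2, 2),
    (2, 55, 2, 3),
    (3, 70, 3, 4),
    (4, 45, 4, 3),
    (5, 20, 3, 2),
    (6, 50, 2, 2),
]

results = {}
for step, cpu, before, table_after in rows:
    # Target 50%, min 2, max 5, matching the example command.
    results[step] = desired_replicas(before, cpu, 50, 2, 5)
    print(f"step {step}: {before} pods at {cpu}% -> "
          f"formula {results[step]}, table {table_after}")
```

Under these assumptions the formula agrees with the table except at steps 3 and 4: it jumps from 3 to 5 pods at 70% CPU, and it holds at 4 pods at 45% CPU instead of scaling down.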

Status Tracker

Variable	Start	After 1	After 2	After 3	After 4	After 5	After 6
Current CPU %	N/A	30	55	70	45	20	50
Pods Count	2	2	3	4	3	2	2

Key Moments - 3 Insights

Why doesn't the pod count change when CPU is below target but not very low (Step 1)?

Why does the pod count increase by only one pod at each scale-up step?

Can the pod count go below the minimum set (2 pods)?

Visual Quiz - 3 Questions

Test your understanding

Looking at the process table, what is the pod count after Step 3?

Concept Snapshot

Horizontal Pod Autoscaler (HPA):
- Monitors pod metrics (e.g., CPU usage).
- Compares current metric to target.
- Scales pod count up/down within min/max limits.
- Runs continuously to keep app responsive.
- Command example: kubectl autoscale deployment NAME --min=X --max=Y --cpu-percent=Z

Full Transcript

The Horizontal Pod Autoscaler watches the CPU usage of the pods in a deployment and compares the average CPU percentage to a target value. If CPU is above the target, it adds pods to handle the load. If CPU is below the target, it removes pods to save resources, but it never goes below the configured minimum or above the configured maximum. In this walkthrough the count moves one pod per check for simplicity; the real controller sizes each change in proportion to how far the metric is from the target, so it can add or remove several pods at once. This process repeats continuously to keep the application running efficiently. The example command sets up autoscaling for a deployment named 'myapp' with a minimum of 2 pods, a maximum of 5 pods, and a target CPU usage of 50%. The process table shows how the pod count changes step by step based on the CPU usage readings.