What Is the Horizontal Pod Autoscaler in Kubernetes: Explained Simply
Horizontal Pod Autoscaler (HPA) in Kubernetes automatically adjusts the number of pods in a deployment based on observed CPU usage or other metrics. It helps keep your application responsive by scaling out when demand increases and scaling in when demand decreases.
How It Works
Think of the Horizontal Pod Autoscaler as a smart helper that watches how busy your app is. If your app gets more visitors and the current pods are working hard, the autoscaler adds more pods to share the load. When fewer people use the app, it reduces the number of pods to save resources.
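It checks metrics like CPU usage or custom signals at regular intervals (every 15 seconds by default). CPU utilization is measured as a percentage of the CPU each pod requests, averaged across all pods. If that average rises above the target you set, the autoscaler adds pods; if it falls below the target, it removes pods, always staying between the minimum and maximum you configured. This way, your app stays fast without wasting resources.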
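The scaling decision can be sketched with the formula given in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The shell function below is only an illustration of that arithmetic, not the controller's actual code. The name desired_replicas and the sample numbers are assumptions made for this sketch, and it leaves out details such as the tolerance band and stabilization windows.

```shell
# Illustrative sketch only: desired_replicas is a hypothetical helper,
# not a kubectl command or part of Kubernetes.
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
desired_replicas() {
  local current=$1 util=$2 target=$3 min=$4 max=$5
  # Integer ceiling of (current * util / target)
  local desired=$(( (current * util + target - 1) / target ))
  # Clamp the result to the configured minimum and maximum
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 2 90 50 1 5    # 2 pods at 90% against a 50% target: prints 4
desired_replicas 4 20 50 1 5    # 4 pods at 20% against a 50% target: prints 2
desired_replicas 3 200 50 1 5   # raw result is 12, capped at the maximum: prints 5
```

Rounding up means the autoscaler leans toward having slightly too many pods over too few, and the clamp is why the pod count never leaves the range you set.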
Example
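This example shows how to create a Horizontal Pod Autoscaler that targets an average CPU utilization of 50% for a deployment named my-app. It will scale between 1 and 5 pods automatically. For this to work, the containers in my-app must declare CPU requests, and the cluster needs a metrics source such as metrics-server.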
kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=5
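If you prefer to keep the autoscaler in version control, the same settings can be written as a manifest. The snippet below is a sketch of an equivalent definition using the autoscaling/v2 API; the file name hpa.yaml is an arbitrary choice, and the HPA is named my-app here on the assumption that you want it to match the deployment.

```shell
# Write a declarative equivalent of the kubectl autoscale command to a file.
# The file name hpa.yaml is an assumption made for this sketch.
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:          # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1           # same as --min=1
  maxReplicas: 5           # same as --max=5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # same as --cpu-percent=50
EOF
```

You would then apply it with kubectl apply -f hpa.yaml and check the current and target utilization with kubectl get hpa my-app.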
When to Use
Use a Horizontal Pod Autoscaler when your app experiences changing traffic or workload. It is a good fit for web apps, APIs, or services where demand can spike or drop unpredictably.
For example, an online store might get many visitors during sales and fewer at night. HPA helps by adding pods during busy times and reducing them when traffic is low, saving money and keeping the app responsive.
Key Points
- HPA automatically adjusts pod count based on metrics like CPU usage.
- It helps balance performance and resource cost.
- You set minimum and maximum pod limits to control scaling.
- It works well for apps with variable workloads.