Imagine you have a service running in Kubernetes. The Horizontal Pod Autoscaler (HPA) adjusts the number of pods based on certain metrics. Which metric does HPA primarily use to decide when to add or remove pods?
Think about what resource usage usually indicates workload intensity.
The HPA primarily monitors CPU utilization, averaged across the workload's pods, and compares it to a configured target to decide whether more pods are needed to handle the load or whether some can be removed to save resources. It can also be configured to use memory, custom, or external metrics.
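As a concrete illustration, a minimal HPA manifest targeting average CPU utilization might look like the sketch below (the workload name `web` is a placeholder, and the `autoscaling/v2` API is assumed to be available in the cluster):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                    # placeholder workload name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 40   # scale to keep average CPU near 40%
```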
In Kubernetes architecture, which component actively monitors metrics and adjusts the number of pods for Horizontal Pod Autoscaler?
Think about which component manages controllers and their loops.
The kube-controller-manager runs controllers including the Horizontal Pod Autoscaler controller, which monitors metrics and updates pod counts accordingly.
Consider a Kubernetes cluster where the metrics server stops responding. What is the expected behavior of the Horizontal Pod Autoscaler during this period?
Think about how autoscaling depends on metrics and what happens if metrics are missing.
Without metrics, HPA cannot make informed scaling decisions, so it maintains the current pod count until metrics become available again.
If you configure the Horizontal Pod Autoscaler with a very low CPU utilization target (e.g., 10%), what is a likely tradeoff?
Consider what happens if the threshold to add pods is very low.
A low CPU target causes HPA to add pods more aggressively, which improves responsiveness but increases resource usage and cost.
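To see the cost side of the tradeoff, a quick back-of-the-envelope sketch: for a fixed total load, the replica count HPA converges toward is roughly inversely proportional to the target utilization (the function name and numbers here are illustrative, not from any Kubernetes API):

```python
import math

def replicas_for_target(total_cpu_demand: float, target_utilization: float) -> int:
    """Rough estimate: pods needed so average per-pod utilization hits the target.

    total_cpu_demand is expressed in "pod-percent" units, e.g. 250 means
    2.5 pods' worth of fully used CPU.
    """
    return math.ceil(total_cpu_demand / target_utilization)

# Same load, different targets: a 10% target needs 5x the pods of a 50% target.
print(replicas_for_target(250, 50))  # → 5
print(replicas_for_target(250, 10))  # → 25
```

The same load that 5 pods could serve at a 50% target requires 25 pods at a 10% target, which is where the extra resource usage and cost come from.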
Your service currently runs 5 pods, each with 50% CPU utilization. The target CPU utilization for HPA is 40%. If the incoming load doubles, approximately how many pods will HPA scale to maintain the target?
Think about how doubling load affects CPU usage and how many pods are needed to keep utilization at 40%.
Doubling the load doubles each pod's CPU usage from 50% to 100%. HPA computes the desired replica count as desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization) = ceil(5 × 100 / 40) = ceil(12.5) = 13 pods. Because HPA rounds up, the answer is 13, not 12 or 10.
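The calculation above follows HPA's standard scaling rule; a minimal sketch in Python (the function name is illustrative):

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    """HPA's core rule: scale the replica count by the ratio of observed
    to target utilization, rounding up so the target is not exceeded."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# 5 pods at 100% CPU (after the load doubles), targeting 40%:
print(desired_replicas(5, 100, 40))  # → 13
```

Rounding up rather than to the nearest integer is deliberate: with 12 pods, average utilization would still sit slightly above the 40% target.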