GCP How-To · Beginner · 4 min read

How to Auto Scale GKE in GCP: Step-by-Step Guide

To auto scale GKE in GCP, enable the Cluster Autoscaler to automatically adjust the number of nodes based on workload demand, and configure the Horizontal Pod Autoscaler to scale pods based on CPU or custom metrics. These features ensure your cluster grows or shrinks smoothly without manual intervention.
📐 Syntax

The main components for auto scaling in GKE are:

  • Cluster Autoscaler: Adjusts the number of nodes in your node pool automatically.
  • Horizontal Pod Autoscaler (HPA): Scales the number of pods in a deployment based on CPU or custom metrics.

Basic commands and configuration include:

gcloud container clusters create [CLUSTER_NAME] \
  --enable-autoscaling \
  --min-nodes=[MIN_NODES] \
  --max-nodes=[MAX_NODES] \
  --zone=[ZONE]

kubectl autoscale deployment [DEPLOYMENT_NAME] \
  --min=[MIN_PODS] \
  --max=[MAX_PODS] \
  --cpu-percent=[TARGET_CPU_UTILIZATION]

Replace placeholders with your values.
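Autoscaling can also be enabled on a node pool of an existing cluster rather than at creation time; a minimal sketch (the cluster, node pool, and zone names are illustrative):

```bash
# Enable node autoscaling on an existing node pool (names are examples)
gcloud container clusters update my-cluster \
  --enable-autoscaling \
  --node-pool=default-pool \
  --min-nodes=1 \
  --max-nodes=5 \
  --zone=us-central1-a
```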

💻 Example

This example creates a GKE cluster with autoscaling enabled for the node pool and sets up a Horizontal Pod Autoscaler for a deployment named my-app. The cluster will have between 1 and 5 nodes, and the deployment will scale pods between 2 and 10 based on 60% CPU usage.

bash
gcloud container clusters create my-cluster \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=5 \
  --zone=us-central1-a

kubectl create deployment my-app --image=nginx

kubectl autoscale deployment my-app \
  --min=2 \
  --max=10 \
  --cpu-percent=60

kubectl get hpa
Output

NAME     REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-app   Deployment/my-app   0%/60%    2         10        2          1m
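To watch what both autoscalers are doing, the following commands can help (a sketch; the names match the example above):

```bash
# Detailed HPA status, including recent scaling events
kubectl describe hpa my-app

# Current node count as the cluster autoscaler adds or removes nodes
kubectl get nodes

# Confirm the node pool's autoscaling settings on the GCP side
gcloud container clusters describe my-cluster \
  --zone=us-central1-a \
  --format="value(nodePools[0].autoscaling)"
```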
⚠️ Common Pitfalls

  • Not enabling autoscaling on node pools: The cluster autoscaler only works if enabled on node pools.
  • Setting min and max nodes or pods too close: This leaves the autoscaler little room to react, making scaling ineffective and risking resource shortages under load.
  • Ignoring resource requests: Pods must have CPU/memory requests set for autoscaling to work properly.
  • Using outdated kubectl versions: Ensure your kubectl matches your cluster version to avoid command errors.
bash
kubectl autoscale deployment my-app --min=1 --max=1 --cpu-percent=50  # Wrong: min and max are the same, no scaling

kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=60  # Right: allows scaling between 2 and 10 pods
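The resource-requests pitfall above can be addressed by giving the deployment explicit CPU and memory requests; a sketch (the request values are illustrative):

```bash
# Without CPU requests, the HPA cannot compute utilization percentages
kubectl set resources deployment my-app \
  --requests=cpu=100m,memory=128Mi
```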
📊 Quick Reference

Remember these key points for GKE autoscaling:

  • Enable --enable-autoscaling on node pools during cluster creation or update.
  • Set sensible min-nodes and max-nodes limits to control costs and capacity.
  • Use kubectl autoscale to create Horizontal Pod Autoscalers for your deployments.
  • Ensure pods have resource requests defined for autoscaling to work.
  • Monitor autoscaling behavior with kubectl get hpa and GCP Console.
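The HPA created with kubectl autoscale can also be expressed as a declarative manifest, which is easier to version-control; a sketch using the autoscaling/v2 API with the same limits as the example:

```bash
# Apply an HPA manifest equivalent to the kubectl autoscale command
kubectl apply -f - <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
EOF
```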

Key Takeaways

Enable Cluster Autoscaler on your GKE node pools to automatically adjust node count.
Use Horizontal Pod Autoscaler to scale pods based on CPU or custom metrics.
Set proper min and max limits for nodes and pods to balance cost and performance.
Define CPU and memory requests in pod specs for autoscaling to function correctly.
Monitor autoscaling status regularly using kubectl and GCP Console.