Scaling Deployments in Kubernetes - Commands & Configuration

Introduction

Sometimes your app needs more copies running to handle more users or work. Scaling a deployment means increasing or decreasing the number of app copies (pods), either automatically or by command, to keep your app fast and available. Scaling is useful in situations like these:

When your website gets more visitors and you want it to stay fast without crashing

When you want to save resources by running fewer app copies during quiet times

When you deploy a new version and want to test it with a few copies before full rollout

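When you want to handle sudden spikes in traffic without manual intervention (this needs an autoscaler such as a HorizontalPodAutoscaler; the commands in this guide scale manually)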

When you want to keep your app running even if some copies fail

Config File - deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
  labels:
    app: example-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-container
        image: nginx:1.23.3
        ports:
        - containerPort: 80

This file defines a Deployment named example-deployment that runs 2 copies (replicas) of an Nginx container. The replicas field sets how many copies should run. The selector's matchLabels must match the labels in the pod template so that the Deployment manages exactly the pods it creates.
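To confirm the replica count the cluster has recorded, the spec.replicas field can be read back once the Deployment has been applied. The following is a minimal sketch, assuming kubectl is configured for your cluster; desired_replicas is a hypothetical helper name, not a kubectl command.

```shell
# Hypothetical helper: print the desired replica count stored in a Deployment's spec.
# Assumes kubectl points at the target cluster and the Deployment already exists.
# Usage: desired_replicas <deployment-name>
desired_replicas() {
  kubectl get deployment "$1" -o jsonpath='{.spec.replicas}'
}
```

Calling desired_replicas example-deployment prints the value of the replicas field, which is the desired count rather than the number of pods that are currently ready.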

Commands

This command creates the Deployment in Kubernetes using the configuration file. It starts 2 copies of the app as defined.

Terminal

kubectl apply -f deployment.yaml

Expected Output

deployment.apps/example-deployment created

This command lists all deployments to check if our example-deployment is running and how many replicas it has.

Terminal

kubectl get deployments

Expected Output

NAME                 READY   UP-TO-DATE   AVAILABLE   AGE
example-deployment   2/2     2            2           10s

This command changes the number of running copies to 5, scaling up the deployment to handle more load.

Terminal

kubectl scale deployment example-deployment --replicas=5

Expected Output

deployment.apps/example-deployment scaled

--replicas - Sets the desired number of pod copies
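The scale command returns as soon as the new replica count is accepted, so the extra pods may still be starting. One way to wait for them is to follow the scale with kubectl rollout status. This is a sketch under assumptions: scale_and_wait is a hypothetical helper name, and the 60-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: scale a Deployment, then block until its pods are available.
# The 60s timeout is an illustrative value; adjust it for slower-starting apps.
# Usage: scale_and_wait <deployment-name> <replica-count>
scale_and_wait() {
  kubectl scale deployment "$1" --replicas="$2" &&
    kubectl rollout status "deployment/$1" --timeout=60s
}
```

For example, scale_and_wait example-deployment 5 scales to 5 replicas and returns a non-zero exit code if the pods are not available before the timeout.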

This command lists all pods with the label app=example-app to verify that 5 pods are running after scaling.

Terminal

kubectl get pods -l app=example-app

Expected Output

NAME                                  READY   STATUS    RESTARTS   AGE
example-deployment-6f7d8c9b7f-abc12   1/1     Running   0          10s
example-deployment-6f7d8c9b7f-def34   1/1     Running   0          10s
example-deployment-6f7d8c9b7f-ghi56   1/1     Running   0          5s
example-deployment-6f7d8c9b7f-jkl78   1/1     Running   0          5s
example-deployment-6f7d8c9b7f-mno90   1/1     Running   0          5s

-l - Filters pods by label

Key Concept

If you remember nothing else from this pattern, remember: scaling changes how many copies of your app run to match demand and keep it available.

Common Mistakes

Trying to scale a deployment that does not exist

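kubectl returns a NotFound error, such as: Error from server (NotFound): deployments.apps "example-deployment" not found. This means no deployment with that name exists in the current namespace.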

Make sure the deployment is created first with kubectl apply before scaling.
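The check described above can be scripted so that a scale is attempted only when the deployment exists. This is a minimal sketch; safe_scale is a hypothetical helper name, and it looks only in the current namespace.

```shell
# Hypothetical helper: scale only if the Deployment exists in the current namespace.
# Usage: safe_scale <deployment-name> <replica-count>
safe_scale() {
  if kubectl get deployment "$1" >/dev/null 2>&1; then
    kubectl scale deployment "$1" --replicas="$2"
  else
    echo "Deployment '$1' not found; run kubectl apply -f deployment.yaml first." >&2
    return 1
  fi
}
```

Running safe_scale example-deployment 5 behaves like the plain scale command when the deployment exists, and prints a hint instead of the raw NotFound error when it does not.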

Setting replicas to zero without intending to stop the app

This stops all copies, making the app unavailable.

Reduce replicas deliberately, and set replicas to zero only when you intend to take the app offline.
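One way to avoid an accidental scale to zero is a small wrapper that asks for explicit confirmation. This is only a sketch: guarded_scale and its --allow-zero option are hypothetical and belong to this wrapper, not to kubectl.

```shell
# Hypothetical helper: refuse to scale to zero unless the caller opts in.
# --allow-zero is an option of this wrapper only; kubectl has no such flag.
# Usage: guarded_scale <deployment-name> <replica-count> [--allow-zero]
guarded_scale() {
  if [ "$2" -eq 0 ] && [ "${3:-}" != "--allow-zero" ]; then
    echo "Refusing to scale '$1' to 0 replicas; pass --allow-zero to confirm." >&2
    return 1
  fi
  kubectl scale deployment "$1" --replicas="$2"
}
```

With this wrapper, guarded_scale example-deployment 0 fails with a message, while guarded_scale example-deployment 0 --allow-zero stops all copies on purpose.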

Not using the correct label selector when checking pods

You may not see the pods related to your deployment and think scaling failed.

Use the exact label used in the deployment spec with -l flag to list the right pods.
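To avoid typing the label by hand, it can be read from the deployment's own selector and passed to -l. This is a sketch that assumes the selector uses a single app label, as in the deployment.yaml shown in this guide; pods_for_deployment is a hypothetical helper name.

```shell
# Hypothetical helper: list the pods matched by a Deployment's "app" selector label.
# Assumes the selector is a single matchLabels entry with the key "app".
# Usage: pods_for_deployment <deployment-name>
pods_for_deployment() {
  # Read the label value from the Deployment spec instead of hard-coding it.
  app_label="$(kubectl get deployment "$1" -o jsonpath='{.spec.selector.matchLabels.app}')"
  kubectl get pods -l "app=${app_label}"
}
```

For the example in this guide, pods_for_deployment example-deployment is equivalent to kubectl get pods -l app=example-app.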

Summary

Create a deployment with a set number of replicas using a YAML file and kubectl apply.

Check the deployment status and number of running replicas with kubectl get deployments.

Scale the deployment up or down using kubectl scale with the --replicas flag.

Verify the number of running pods matches the desired replicas using kubectl get pods with label filtering.