
Pods and deployments for services in Microservices - Scalability & System Analysis

Scalability Analysis - Pods and deployments for services
Growth Table: Pods and Deployments for Services
| Users | Pods | Deployments | Load Balancing | Resource Usage | Observability |
|---|---|---|---|---|---|
| 100 users | 1-2 pods per service | Single deployment per service | Simple service discovery | Low CPU and memory | Basic logging and metrics |
| 10,000 users | 5-10 pods per service | Multiple deployments for canary and blue-green | ClusterIP Services and ingress controllers | Moderate CPU and memory, autoscaling starts | Centralized logging and monitoring |
| 1,000,000 users | 50-100 pods per service | Multiple deployments with rollout strategies | Advanced ingress, service mesh for traffic control | High CPU, memory; horizontal pod autoscaling | Distributed tracing, alerting, dashboards |
| 100,000,000 users | Thousands of pods across clusters | Multi-cluster deployments, global rollout | Multi-cluster service mesh, global load balancing | Very high resource usage; cluster autoscaling | AI-driven monitoring, anomaly detection |
First Bottleneck

The first bottleneck is usually the control plane of the orchestration system (like Kubernetes). As the number of pods and deployments grows, the API server and scheduler can become overwhelmed managing state and scheduling pods.

Also, node resources (CPU, memory) limit how many pods can run on a single machine. When pods exceed node capacity, scheduling delays and resource contention occur.

Scaling Solutions
  • Horizontal scaling: Add more nodes to the cluster to run more pods.
  • Cluster autoscaling: Automatically add or remove nodes based on pod demand.
  • Control plane scaling: Run a highly available Kubernetes control plane with multiple API server replicas.
  • Namespace and deployment partitioning: Split services into namespaces or multiple clusters to reduce control plane load.
  • Pod autoscaling: Use Horizontal Pod Autoscaler (HPA) to adjust pod count based on CPU or custom metrics.
  • Service mesh: Manage traffic routing and observability efficiently at scale.
  • Efficient resource requests and limits: Set accurate CPU and memory requests so the scheduler can place pods correctly, and set limits to prevent resource contention.
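
The pod autoscaling item above can be sketched as a HorizontalPodAutoscaler manifest. This is a minimal sketch, not a recommended configuration: the target Deployment name web-app, the replica bounds, and the 70% CPU target are all illustrative assumptions. CPU utilization is measured against each container's CPU request, so the target Deployment needs requests set, and the cluster needs a metrics source such as metrics-server.

```yaml
# Hypothetical HPA for a Deployment named web-app (name and numbers are assumptions).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 5        # illustrative lower bound
  maxReplicas: 100      # illustrative upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of the CPU request, averaged across pods
```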
Back-of-Envelope Cost Analysis
  • At 10,000 users, expect ~5-10 pods per service, each pod using ~0.5 CPU and 512 MB RAM.
  • At 1 million users, expect 50-100 pods per service. At 0.5 CPU and 512 MB each, that is roughly 25-50 cores and 25-50 GB of RAM per service.
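  • As a rough planning figure, assume the API server handles on the order of 1,000-2,000 pod lifecycle events per second. Actual throughput depends on control plane sizing and etcd performance, and sustained load beyond it delays scheduling and rollouts.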
  • Network bandwidth per node depends on pod traffic; 1 Gbps network supports ~125 MB/s.
  • Storage for logs and metrics grows with pod count; consider centralized solutions with retention policies.
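
The per-pod figures above map onto container resource requests. The following is a sketch under stated assumptions: the Deployment name, labels, and image are placeholders, and the limits are a guess at a reasonable ceiling rather than values from the estimate. With 10 replicas requesting 500m CPU and 512Mi each, the scheduler reserves about 5 cores and 5 GiB for this service.

```yaml
# Illustrative Deployment sized from the back-of-envelope numbers (names are assumptions).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 10
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web-container
        image: nginx
        resources:
          requests:
            cpu: 500m        # 0.5 CPU, used by the scheduler for placement
            memory: 512Mi
          limits:
            cpu: "1"         # assumed ceiling, not part of the estimate
            memory: 512Mi
```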
Interview Tip

Start by explaining how pods and deployments work at small scale. Then discuss what changes as user load grows. Identify the first bottleneck clearly (control plane or node resources). Propose specific scaling solutions like autoscaling and multi-cluster setups. Use numbers to support your points. Finally, mention monitoring and observability as critical for managing scale.

Self Check

Your database handles 1000 QPS. Traffic grows 10x. What do you do first?

Answer: Since the database is the bottleneck, first add read replicas to distribute read traffic and implement caching to reduce load. Also, optimize queries and consider sharding if writes grow significantly.

Key Result
Pods and deployments scale by adding more pods and nodes, but the orchestration control plane and node resources become bottlenecks first. Autoscaling, multi-cluster setups, and efficient resource management are key to scaling services reliably.

Practice

1. What is the main role of a Pod in a microservices architecture?
easy
A. To manage updates and scaling of containers
B. To run one or more containers together as a single unit
C. To route network traffic between services
D. To store persistent data for containers

Solution

  1. Step 1: Understand what a Pod is

    A Pod is the smallest deployable unit in Kubernetes that runs one or more containers together.
  2. Step 2: Differentiate Pod from other components

    Deployments manage Pods, Services route traffic, and persistent storage is handled separately through volumes.
  3. Final Answer:

    To run one or more containers together as a single unit -> Option B
  4. Quick Check:

    Pod = container unit [OK]
Hint: Pods run containers; deployments manage pods [OK]
Common Mistakes:
  • Confusing Pods with Deployments
  • Thinking Pods handle networking
  • Assuming Pods store data
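
To make "one or more containers together as a single unit" concrete, here is a minimal sketch of a Pod with two containers. The Pod name, the sidecar container, and its placeholder command are assumptions for illustration. Both containers are scheduled onto the same node and share the Pod's network namespace.

```yaml
# Hypothetical two-container Pod; names and the sidecar command are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
  labels:
    app: web
spec:
  containers:
  - name: web-container
    image: nginx
  - name: log-sidecar
    image: busybox
    command: ["sh", "-c", "while true; do sleep 3600; done"]   # stand-in for real sidecar work
```

In practice Pods like this are rarely created directly; a Deployment's Pod template carries the same spec.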
2. Which of the following is the correct YAML snippet to define a Deployment that runs 3 replicas of a Pod?
easy
A. kind: Pod\nreplicas: 3\nmetadata:\n name: my-pod
B. replicas: 3\nkind: Service\nmetadata:\n name: my-service
C. replicas: 3\nkind: Deployment\nmetadata:\n name: my-deployment
D. kind: Deployment\nmetadata:\n name: my-deployment\nreplicas: three

Solution

  1. Step 1: Identify correct kind and replicas field

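    Deployment kind is correct and replicas should be a number, here 3. The options are abbreviated: in a complete manifest, replicas sits under spec, next to the selector and the Pod template.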
  2. Step 2: Check metadata and syntax

    Metadata name is valid; 'replicas: three' is invalid because replicas must be numeric.
  3. Final Answer:

    replicas: 3\nkind: Deployment\nmetadata:\n name: my-deployment -> Option C
  4. Quick Check:

    Deployment with numeric replicas = correct YAML [OK]
Hint: Deployments use 'kind: Deployment' and numeric replicas [OK]
Common Mistakes:
  • Using 'kind: Pod' instead of Deployment
  • Setting replicas as a word instead of number
  • Confusing Service with Deployment
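
For reference, a complete manifest for the 3-replica Deployment might look like the sketch below. The label app: my-app, the container name, and the image are assumptions added to make the manifest valid; only the kind, the name my-deployment, and the replica count come from the question.

```yaml
# Full form of the abbreviated answer; label, container name, and image are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 3            # numeric, and nested under spec
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-container
        image: nginx
```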
3. Given this Deployment YAML snippet, how many Pods will be running after applying it?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web-container
        image: nginx
medium
A. 4 Pods
B. 0 Pods until manually started
C. 1 Pod
D. Depends on the number of nodes

Solution

  1. Step 1: Read replicas count in Deployment spec

    The replicas field is set to 4, meaning Kubernetes will maintain 4 Pods.
  2. Step 2: Understand Deployment behavior

    Deployment automatically creates and manages the specified number of Pods.
  3. Final Answer:

    4 Pods -> Option A
  4. Quick Check:

    replicas = 4 Pods running [OK]
Hint: replicas number = Pods count after deployment [OK]
Common Mistakes:
  • Assuming only 1 Pod runs by default
  • Thinking Pods need manual start
  • Confusing nodes with Pod count
4. You applied a Deployment YAML but notice no Pods are running. Which is the most likely cause?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: api-container
        image: myapi:latest
medium
A. The Deployment kind is incorrect
B. The replicas count is too high for the cluster
C. The container image name is invalid
D. The selector labels do not match the Pod template labels

Solution

  1. Step 1: Compare selector and template labels

    The selector uses the label 'app: api', but the Pod template is labeled 'app: backend', so the two do not match.
  2. Step 2: Understand label matching importance

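    A Deployment uses its selector to find the Pods it manages, so the selector must match the Pod template labels. With a mismatch, the API server rejects the manifest with a validation error, so the Deployment is never created and no Pods run.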
  3. Final Answer:

    The selector labels do not match the Pod template labels -> Option D
  4. Quick Check:

    Selector labels must match Pod labels [OK]
Hint: Every selector label must appear in the Pod template labels [OK]
Common Mistakes:
  • Ignoring label mismatch
  • Assuming image name causes no Pods
  • Thinking replicas count blocks Pod creation
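
A corrected version of the manifest from this question could look like the following. It assumes app: api is the intended label; the question does not say which of the two labels is the right one, so changing the selector to app: backend would work equally well.

```yaml
# Corrected manifest, assuming app: api is the intended label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api         # now matches spec.selector.matchLabels
    spec:
      containers:
      - name: api-container
        image: myapi:latest
```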
5. You want to update a microservice with zero downtime using Kubernetes. Which approach best uses Pods and Deployments to achieve this?
hard
A. Update the Deployment with a new image version; Kubernetes creates new Pods and gradually replaces old ones
B. Delete all old Pods manually and then create new Pods with the updated image
C. Scale down the Deployment to zero replicas, then scale up with the new image
D. Create a new Deployment with the updated image and delete the old Deployment immediately

Solution

  1. Step 1: Understand Deployment update strategy

    Deployments support rolling updates that create new Pods and remove old Pods gradually.
  2. Step 2: Compare options for zero downtime

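    Manual deletion or scaling down removes all serving Pods, which causes downtime. Deleting the old Deployment immediately terminates its Pods before the new ones are ready, which also causes downtime.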
  3. Final Answer:

    Update the Deployment with a new image version; Kubernetes creates new Pods and gradually replaces old ones -> Option A
  4. Quick Check:

    Rolling update = zero downtime update [OK]
Hint: Use Deployment rolling updates for zero downtime [OK]
Common Mistakes:
  • Deleting Pods manually, which causes downtime
  • Scaling to zero, which interrupts service
  • Deleting the old Deployment before the new Pods are ready
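
A rolling update tuned for zero downtime can be sketched as follows. The Deployment name, labels, image tag, and probe path are illustrative assumptions. Setting maxUnavailable to 0 keeps the full replica count serving during the rollout, and the readiness probe stops traffic from reaching a new Pod until it responds; without a readiness probe, a rolling update alone does not guarantee zero downtime.

```yaml
# Hypothetical rolling-update configuration; names, tag, and probe path are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0    # never drop below the desired replica count
      maxSurge: 1          # add one new Pod at a time
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web-container
        image: nginx:1.27  # illustrative new version; changing this triggers the rollout
        readinessProbe:
          httpGet:
            path: /
            port: 80
```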