
Why Kubernetes manages microservice deployment in Microservices - Design It to Understand It

Design: Microservice Deployment Management with Kubernetes
Focus on the deployment, scaling, and management of microservices using Kubernetes. Microservice internal design and business logic are out of scope.
Functional Requirements
FR1: Deploy multiple microservices independently
FR2: Ensure high availability and fault tolerance
FR3: Scale microservices automatically based on load
FR4: Manage service discovery and load balancing
FR5: Handle rolling updates without downtime
FR6: Monitor health and restart failed services automatically
Non-Functional Requirements
NFR1: Support at least 100 microservice instances concurrently
NFR2: API response latency p99 under 200ms
NFR3: Availability target of 99.9% uptime
NFR4: Deployment changes should not cause downtime
NFR5: Support heterogeneous microservices (different languages and runtimes)
Key Components
Container runtime (e.g., Docker)
Kubernetes control plane (API server, scheduler, controller manager)
Kubernetes worker nodes
Pods and ReplicaSets
Services and Ingress controllers
ConfigMaps and Secrets
Horizontal Pod Autoscaler
Health probes (liveness and readiness)
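As a sketch of how liveness and readiness probes attach to a container spec (the service name, image, endpoint paths, and timings are all illustrative assumptions, not part of the original design):

```yaml
# Container spec fragment: the kubelet restarts the container when the
# liveness probe fails, and removes the Pod from Service endpoints while
# the readiness probe fails.
containers:
  - name: orders            # hypothetical microservice
    image: example/orders:1.4.2
    ports:
      - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 15
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5
```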
Design Patterns
Declarative configuration management
Rolling updates and rollbacks
Self-healing and auto-restart
Service discovery and load balancing
Autoscaling based on metrics
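The declarative-configuration and rolling-update patterns above can be sketched in a single Deployment manifest (all names and numbers are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders          # hypothetical service
spec:
  replicas: 3           # desired state; a ReplicaSet keeps 3 Pods running
  selector:
    matchLabels:
      app: orders
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0 # never drop below the desired count (FR5, NFR4)
      maxSurge: 1       # add at most one extra Pod during the rollout
  template:
    metadata:
      labels:
        app: orders
    spec:
      containers:
        - name: orders
          image: example/orders:1.4.2
```

The rollback pattern is the same mechanism in reverse: `kubectl rollout undo deployment/orders` reverts to the previous ReplicaSet.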
Reference Architecture
User --> Kubernetes API Server --> Scheduler --> Worker Nodes (Pods running microservices)
Worker Nodes --> Kubelet (manages containers)
Pods --> Services (for load balancing and discovery)
Horizontal Pod Autoscaler monitors metrics --> scales Pods
Health Probes --> Kubelet restarts unhealthy Pods
Components
Kubernetes API Server (Kubernetes): Central control point for managing cluster state and deployments
Scheduler (Kubernetes): Assigns Pods to worker nodes based on resource availability
Worker Nodes (Linux servers with a container runtime): Run microservice containers inside Pods
Pods (Kubernetes): Smallest deployable unit; runs one or more containers
Services (Kubernetes): Provide stable network endpoints and load balancing for Pods
Horizontal Pod Autoscaler (Kubernetes): Automatically scales Pods based on CPU or custom metrics
Health Probes (Kubernetes): Check container health and readiness to decide when to restart a Pod or route traffic to it
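Service discovery and load balancing can be sketched with a minimal Service manifest (the name and ports are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: orders          # resolvable in-cluster as orders.<namespace>.svc
spec:
  type: ClusterIP       # stable virtual IP inside the cluster
  selector:
    app: orders         # endpoints = healthy Pods carrying this label
  ports:
    - port: 80          # port clients call
      targetPort: 8080  # container port behind it
```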
Request Flow
1. User sends request to microservice via Service endpoint
2. Service load balances request to one of the healthy Pods
3. Pod runs containerized microservice to handle request
4. Kubelet on worker node monitors Pod health using probes
5. If Pod is unhealthy, Kubelet restarts it automatically
6. Horizontal Pod Autoscaler monitors metrics and adjusts Pod count
7. Scheduler places new Pods on nodes with available resources
8. Kubernetes API Server manages desired state and updates
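Steps 6 and 7 above are driven by an autoscaler object; a minimal sketch using the `autoscaling/v2` API (the target name and thresholds are assumptions):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: orders
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders
  minReplicas: 3
  maxReplicas: 30       # headroom toward the 100-instance target (NFR1)
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above ~70% average CPU
```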
Database Schema
Not applicable: Kubernetes manages deployment and runtime, not data storage.
Scaling Discussion
Bottlenecks
API Server overload with too many requests
Scheduler delays when cluster size grows
Worker node resource exhaustion
Network bottlenecks between services
Autoscaler reacting slowly to traffic spikes
Solutions
Use multiple API Server replicas behind a load balancer
Optimize scheduler with custom policies or multiple schedulers
Add more worker nodes and use resource quotas
Implement network policies and use efficient CNI plugins
Tune autoscaler thresholds and use predictive scaling
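Tuning autoscaler thresholds can be expressed declaratively through the HPA's `behavior` field in the `autoscaling/v2` API; the windows and policies below are illustrative:

```yaml
# Fragment of an HPA spec: react quickly to spikes, scale down cautiously.
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0      # act on traffic spikes immediately
    policies:
      - type: Percent
        value: 100                     # allow doubling per period
        periodSeconds: 15
  scaleDown:
    stabilizationWindowSeconds: 300    # wait 5 minutes before shrinking
    policies:
      - type: Pods
        value: 2                       # remove at most 2 Pods per minute
        periodSeconds: 60
```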
Interview Tips
Time: Spend 10 minutes explaining Kubernetes components and their roles, 15 minutes on deployment and scaling flow, 10 minutes on handling failures and updates, 10 minutes on scaling challenges and solutions
Kubernetes provides declarative deployment and self-healing
Pods are the smallest deployable units running containers
Services enable stable networking and load balancing
Autoscaling adjusts capacity automatically based on load
Rolling updates avoid downtime during deployments
Health probes ensure only healthy Pods serve traffic
Scaling requires addressing API server, scheduler, and node limits