
Blue-green deployment in Microservices - System Design Exercise

Design: Blue-Green Deployment System
Design the deployment strategy and infrastructure for blue-green deployment of microservices. Exclude CI/CD pipeline details and code-level rollback mechanisms.
Functional Requirements
FR1: Deploy new versions of microservices with zero downtime
FR2: Allow quick rollback to previous version if issues occur
FR3: Minimize risk of deployment failures affecting users
FR4: Support automated traffic switching between versions
FR5: Monitor health of both blue and green environments
Non-Functional Requirements
NFR1: Handle up to 10,000 concurrent users during deployment
NFR2: API response latency p99 under 200ms during deployment
NFR3: Availability target of 99.9% uptime including deployment windows
NFR4: Deployment process should complete within 5 minutes
NFR5: Support multiple microservices independently deployed
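To put NFR3 and NFR4 side by side, the short calculation below converts an availability target into a downtime budget. The 30-day measurement window and the helper name `downtime_budget_minutes` are assumptions for illustration; the exercise does not say over what period the 99.9% is measured.

```python
def downtime_budget_minutes(availability_pct: float, window_days: int = 30) -> float:
    """Minutes of downtime allowed in the window for a given availability target."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


budget = downtime_budget_minutes(99.9)
print(f"{budget:.1f} minutes per 30 days")  # prints "43.2 minutes per 30 days"
```

Under that assumption, a single 5-minute deployment that took the service offline would use up more than a tenth of the monthly budget, which is why the traffic switch itself has to be downtime-free.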
Think Before You Design
Key Components
Load balancer or API gateway for traffic routing
Two identical production environments (blue and green)
Health check and monitoring system
Deployment automation tool
Service registry and discovery
Design Patterns
Canary deployment as an alternative
Feature toggles for gradual rollout
Circuit breaker for fault tolerance
Immutable infrastructure for environment consistency
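One way to see how canary deployment relates to blue-green is to treat both as a traffic weight. The sketch below is illustrative only: `pick_environment` and the hash-bucket scheme are assumptions, not something the exercise prescribes. Blue-green corresponds to a weight of 0 or 100, while a canary rollout uses values in between.

```python
import hashlib


def pick_environment(user_id: str, green_percent: int) -> str:
    """Route a user to green if their stable hash bucket is below green_percent."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "green" if bucket < green_percent else "blue"


# Blue-green: everyone is on one side or the other.
print(pick_environment("user-42", 0))    # prints "blue"
print(pick_environment("user-42", 100))  # prints "green"
```

Hashing the user ID keeps each user on the same environment across requests, which matters for canary weights but is irrelevant at 0 or 100.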
Reference Architecture
          +-------------------+
          |    Users/Clients  |
          +---------+---------+
                    |
                    v
          +---------+---------+
          |  Load Balancer /  |
          |   API Gateway     |
          +----+--------+-----+
               |        |
       +-------+        +-------+
       |                        |
+------+-------+        +-------+------+
|   Blue Env   |        |   Green Env  |
| (Current)    |        | (New Version)|
+--------------+        +--------------+
       |                        |
+------+-------+        +-------+------+
| Microservices|        | Microservices|
+--------------+        +--------------+
       |                        |
+------+-------+        +-------+------+
|  Database(s) |        |  Database(s) |
+--------------+        +--------------+
Components
Load Balancer / API Gateway (Nginx, Envoy, or AWS ALB): Routes user traffic to either the blue or green environment based on deployment state.
Blue Environment (Kubernetes cluster or VM instances): Runs the current stable version of the microservices and serves live traffic.
Green Environment (Kubernetes cluster or VM instances): Runs the new version of the microservices, deployed and tested before traffic is switched.
Health Check and Monitoring (Prometheus, Grafana, or custom scripts): Continuously verifies the readiness and performance of both environments.
Deployment Automation (Jenkins, ArgoCD, or Spinnaker): Automates deployment, testing, and traffic switching.
Database (relational or NoSQL DB with versioning support): Stores persistent data accessible by both environments, with a migration strategy.
Request Flow
1. User sends request to Load Balancer/API Gateway.
2. Load Balancer routes request to Blue environment (current live version).
3. Deploy new microservice version to Green environment without affecting Blue.
4. Run automated tests and health checks on Green environment.
5. If Green passes health checks, switch Load Balancer traffic from Blue to Green.
6. Users now receive responses from Green environment.
7. Monitor Green environment closely for errors or performance issues.
8. If issues are detected, roll back by switching traffic back to the Blue environment.
9. Once Green is stable, Blue environment can be updated for next deployment.
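The flow above can be sketched as a small state machine. This is a minimal illustration under stated assumptions: `BlueGreenRouter`, `release`, and the `deploy` / `is_healthy` callbacks are hypothetical names standing in for the load balancer, deployment tool, and health checks, not a real tool's API.

```python
from typing import Callable, Optional


class BlueGreenRouter:
    """Tracks which environment receives live traffic."""

    def __init__(self, live: str = "blue") -> None:
        self.live = live
        self.previous: Optional[str] = None

    def idle(self) -> str:
        """Return the environment that is not serving live traffic."""
        return "green" if self.live == "blue" else "blue"

    def switch(self) -> None:
        """Point live traffic at the idle environment (step 5)."""
        self.previous, self.live = self.live, self.idle()

    def rollback(self) -> None:
        """Point live traffic back at the previous environment (step 8)."""
        if self.previous is None:
            raise RuntimeError("no previous environment to roll back to")
        self.live, self.previous = self.previous, None


def release(router: BlueGreenRouter,
            deploy: Callable[[str], None],
            is_healthy: Callable[[str], bool]) -> bool:
    """Run steps 3 to 8; return True if the new version stays live."""
    target = router.idle()
    deploy(target)                 # step 3: live environment is untouched
    if not is_healthy(target):     # step 4: gate the switch on health
        return False
    router.switch()                # step 5
    if not is_healthy(target):     # step 7: check again under live traffic
        router.rollback()          # step 8
        return False
    return True


router = BlueGreenRouter()
ok = release(router, deploy=lambda env: None, is_healthy=lambda env: True)
print(ok, router.live)  # prints "True green"
```

Keeping `previous` around is what makes the rollback cheap: the old environment is still running, so reverting is only a routing change.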
Database Schema
Entities remain consistent across the blue and green environments because both connect to the same database instance or cluster, which keeps their data consistent. Schema changes must therefore be backward-compatible, so that the old and new versions can run against the same schema at the same time. Use versioned migration scripts to update the schema without downtime.
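A backward-compatible change is often done as an additive ("expand") step first. The sketch below uses an in-memory SQLite database; the `users` table and `display_name` column are made up for illustration and are not part of the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Blue (old version) writes using the original column list.
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

# Expand step: add a nullable column, so blue's INSERT statements stay valid.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# Blue keeps working against the expanded schema.
conn.execute("INSERT INTO users (name) VALUES (?)", ("bob",))

# Green (new version) writes the new column and falls back when it is NULL.
conn.execute(
    "INSERT INTO users (name, display_name) VALUES (?, ?)", ("carol", "Carol C.")
)
rows = conn.execute(
    "SELECT name, COALESCE(display_name, name) FROM users ORDER BY id"
).fetchall()
print(rows)  # [('alice', 'alice'), ('bob', 'bob'), ('carol', 'Carol C.')]
```

Any destructive follow-up, such as dropping a column the old version still reads, would wait until blue is no longer needed as a rollback target.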
Scaling Discussion
Bottlenecks
Load balancer capacity limits during traffic switch
Database schema changes causing downtime or data inconsistency
Health check delays slowing deployment process
Resource duplication doubling infrastructure costs
Rollback complexity if multiple microservices fail simultaneously
Solutions
Use scalable load balancers with auto-scaling and connection draining
Implement zero-downtime database migrations with feature toggles
Optimize health checks for fast and reliable readiness signals
Use container orchestration to efficiently share resources
Automate rollback procedures and isolate failures per microservice
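Connection draining, mentioned in the first solution, means the old environment stops taking new requests but finishes the ones already in progress. The class below is a simplified in-process sketch; `DrainingBackend` and its methods are hypothetical, and real load balancers expose draining as a configuration setting rather than code like this.

```python
import threading


class DrainingBackend:
    """Counts in-flight requests so an environment can be drained before retirement."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.accepting = True
        self._in_flight = 0
        self._lock = threading.Lock()
        self._idle = threading.Condition(self._lock)

    def begin_request(self) -> bool:
        """Admit a request unless draining has started."""
        with self._lock:
            if not self.accepting:
                return False
            self._in_flight += 1
            return True

    def end_request(self) -> None:
        """Mark a request finished and wake any waiting drain() call."""
        with self._idle:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._idle.notify_all()

    def drain(self, timeout: float) -> bool:
        """Stop admitting requests; return True if in-flight work finished in time."""
        with self._idle:
            self.accepting = False
            return self._idle.wait_for(lambda: self._in_flight == 0, timeout=timeout)


blue = DrainingBackend("blue")
blue.begin_request()
blue.end_request()
print(blue.drain(timeout=0.1))  # prints "True"
print(blue.begin_request())     # prints "False"
```

The timeout bounds how long a switch can be held up by slow requests, which matters for a deployment window like the 5 minutes in NFR4.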
Interview Tips
Time: Spend 10 minutes understanding requirements and clarifying scope, 20 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing.
Explain how blue-green deployment reduces downtime and risk
Describe traffic routing and environment switching clearly
Discuss database migration challenges and solutions
Highlight monitoring and rollback strategies
Mention alternatives like canary deployments and feature toggles

Practice

1. What is the main purpose of blue-green deployment in microservices?
easy
A. To improve database query speed
B. To increase the number of microservices in the system
C. To reduce downtime by switching traffic between two identical environments
D. To simplify the codebase by merging services

Solution

  1. Step 1: Understand blue-green deployment concept

    Blue-green deployment uses two identical environments to avoid downtime during updates.
  2. Step 2: Identify the main goal

    The main goal is to switch traffic between environments to keep the system available without interruption.
  3. Final Answer:

    To reduce downtime by switching traffic between two identical environments -> Option C
  4. Quick Check:

    Blue-green deployment = reduce downtime [OK]
Hint: Blue-green means two environments for zero downtime [OK]
Common Mistakes:
  • Confusing deployment with scaling
  • Thinking it improves database speed
  • Assuming it merges microservices
2. Which of the following is the correct sequence in a blue-green deployment?
easy
A. Deploy to green, test, switch traffic from blue to green
B. Deploy to blue, switch traffic, then test on green
C. Switch traffic first, then deploy to blue
D. Deploy to green, switch traffic, then test on blue

Solution

  1. Step 1: Recall deployment steps

    In blue-green deployment, new code is deployed to the inactive environment (green).
  2. Step 2: Test and switch traffic

    After testing green, traffic is switched from blue (active) to green (new).
  3. Final Answer:

    Deploy to green, test, switch traffic from blue to green -> Option A
  4. Quick Check:

    Deploy-test-switch = A [OK]
Hint: Deploy to inactive env, test, then switch traffic [OK]
Common Mistakes:
  • Switching traffic before testing
  • Testing on active environment
  • Deploying after switching traffic
3. Consider this simplified code snippet for switching traffic in blue-green deployment:
current_env = "blue"
new_env = "green"
if current_env == "blue":
    current_env = new_env
else:
    current_env = "blue"
print(current_env)
What will be the output?
medium
A. "blue"
B. None
C. SyntaxError
D. "green"

Solution

  1. Step 1: Analyze initial variables

    current_env starts as "blue", new_env is "green".
  2. Step 2: Evaluate the if condition

    Since current_env == "blue", it sets current_env = new_env, which is "green".
  3. Final Answer:

    "green" -> Option D
  4. Quick Check:

    Switching from blue to green prints green [OK]
Hint: If current is blue, switch to green [OK]
Common Mistakes:
  • Confusing assignment direction
  • Expecting original value to print
  • Thinking code has syntax error
4. A team uses blue-green deployment but users report downtime during the switch. What is the most likely cause?
medium
A. The old environment was not shut down
B. Traffic was switched before the new environment was fully ready
C. The database was not updated
D. The new environment was tested too long

Solution

  1. Step 1: Understand downtime cause in blue-green

    Downtime usually happens if traffic switches before the new environment is ready to serve requests.
  2. Step 2: Evaluate options

    Leaving the old environment running does not interrupt users, and a missed database update usually shows up as errors rather than as downtime at the moment of the switch. Testing for too long delays the deployment but does not cause downtime.
  3. Final Answer:

    Traffic was switched before the new environment was fully ready -> Option B
  4. Quick Check:

    Premature traffic switch = downtime [OK]
Hint: Switch traffic only after new env is ready [OK]
Common Mistakes:
  • Assuming old env causes downtime
  • Ignoring readiness checks
  • Blaming database updates for switch downtime
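A readiness gate for this scenario might require several consecutive successful probes before the switch, so that one lucky response does not count as ready. This is a sketch under assumptions: `wait_until_ready`, its parameters, and the default of three successes are illustrative choices, not values from the exercise.

```python
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool],
                     required_successes: int = 3,
                     max_attempts: int = 20,
                     interval_s: float = 0.0) -> bool:
    """Return True once the probe succeeds required_successes times in a row."""
    streak = 0
    for _ in range(max_attempts):
        if probe():
            streak += 1
            if streak >= required_successes:
                return True
        else:
            streak = 0  # any failure resets the streak
        time.sleep(interval_s)
    return False


# Simulated probe results for a green environment that is still warming up.
results = iter([False, True, True, False, True, True, True])
print(wait_until_ready(lambda: next(results), max_attempts=7))  # prints "True"
```

In practice the probe would call the new environment's health endpoint, and traffic would be switched only when this function returns True.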
5. You manage a critical microservices system using blue-green deployment. After switching traffic to green, you discover a severe bug. What is the best immediate action to minimize downtime?
hard
A. Switch traffic back to blue environment immediately
B. Fix the bug in green environment and keep traffic there
C. Restart both environments simultaneously
D. Deploy a new environment and switch traffic there

Solution

  1. Step 1: Understand rollback in blue-green deployment

    Blue-green allows quick rollback by switching traffic back to the previous stable environment (blue).
  2. Step 2: Evaluate options for minimizing downtime

    Fixing the bug in green delays recovery while users stay exposed to it; restarting both environments causes downtime; deploying a new environment takes time.
  3. Final Answer:

    Switch traffic back to blue environment immediately -> Option A
  4. Quick Check:

    Rollback by switching traffic = minimize downtime [OK]
Hint: Rollback by switching traffic to old env fast [OK]
Common Mistakes:
  • Trying to fix bug before rollback
  • Restarting both environments causing downtime
  • Deploying new env wastes time
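The rollback decision can also be automated from monitoring data. The function below is a minimal sketch: `should_roll_back`, the 1% error-rate threshold, and the 100-request minimum are assumptions chosen for illustration, and a real system would tune them per service.

```python
def should_roll_back(errors: int, requests: int,
                     max_error_rate: float = 0.01,
                     min_requests: int = 100) -> bool:
    """Decide whether green's observed error rate justifies switching back to blue."""
    if requests < min_requests:
        return False  # too little traffic to judge; keep observing
    return errors / requests > max_error_rate


print(should_roll_back(errors=5, requests=200))  # prints "True"  (2.5% > 1%)
print(should_roll_back(errors=1, requests=200))  # prints "False" (0.5% <= 1%)
```

The minimum-request guard avoids rolling back on a handful of early failures, at the cost of reacting more slowly right after the switch.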