Canary deployment in Microservices - Architecture Diagram

System Overview - Canary deployment

Canary deployment is a technique to release new software versions to a small subset of users first. This helps detect issues early without affecting all users. The system routes a small percentage of traffic to the new version while the rest use the stable version.
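
The percentage split described above can be sketched in a few lines. This is a minimal illustration, not a production router: the function name `choose_version`, the labels "canary" and "stable", and the 5% figure are assumptions made for the example.

```python
import random

def choose_version(canary_percent, rng=random):
    """Return "canary" for roughly canary_percent% of calls, else "stable"."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # rng.random() is uniform in [0.0, 1.0), so the comparison is true
    # for about canary_percent out of every 100 requests.
    return "canary" if rng.random() * 100 < canary_percent else "stable"

# Simulate 10,000 requests with an assumed 5% canary share.
rng = random.Random(42)  # seeded only so the demo is repeatable
counts = {"canary": 0, "stable": 0}
for _ in range(10_000):
    counts[choose_version(5, rng)] += 1
print(counts)
```

Random selection like this does not keep a given user on the same version between requests; a hash of the user ID is a common alternative when that matters.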

Architecture Diagram
User
  |
  v
Load Balancer
  |
  v
Traffic Router (Canary Controller)
  /           \
 v             v
Service v1   Service v2 (Canary)
  |             |
  v             v
Database       Database
  |             |
Cache          Cache
Components
User (client): End user making requests
Load Balancer (load_balancer): Distributes incoming traffic evenly to Traffic Router
Traffic Router (Canary Controller) (service): Routes a small percentage of traffic to new version (canary) and rest to stable version
Service v1 (service): Stable version of the microservice handling most user requests
Service v2 (Canary) (service): New version deployed to a small subset of users for testing
Database (database): Stores persistent data accessed by both service versions
Cache (cache): Speeds up data access for services
Request Flow - 18 Hops
User -> Load Balancer
Load Balancer -> Traffic Router (Canary Controller)
Traffic Router (Canary Controller) -> Service v2 (Canary)
Traffic Router (Canary Controller) -> Service v1
Service v1 -> Cache
Cache -> Service v1
Service v1 -> Database
Database -> Service v1
Service v1 -> Cache
Service v1 -> Traffic Router (Canary Controller)
Service v2 (Canary) -> Cache
Cache -> Service v2 (Canary)
Service v2 (Canary) -> Database
Database -> Service v2 (Canary)
Service v2 (Canary) -> Cache
Service v2 (Canary) -> Traffic Router (Canary Controller)
Traffic Router (Canary Controller) -> Load Balancer
Load Balancer -> User
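
The service hops in this flow (cache lookup, database read, second cache hop) look like a cache-aside read, where the service fills the cache after a miss. The sketch below assumes that reading; the `Service` class, the dictionary stand-ins for the cache and database, and the key `user:1` are hypothetical.

```python
class Service:
    """One service version with its own cache and a shared database."""

    def __init__(self, name, database):
        self.name = name
        self.database = database  # dict standing in for the persistent store
        self.cache = {}           # dict standing in for the cache

    def handle(self, key):
        # Hop: Service -> Cache (lookup)
        if key in self.cache:
            return self.cache[key], "cache hit"
        # Hop: Service -> Database (read on a miss)
        value = self.database[key]
        # Hop: Service -> Cache (populate for later requests)
        self.cache[key] = value
        return value, "cache miss"

database = {"user:1": "Ada"}  # assumed to be shared by both versions
v1 = Service("v1", database)
print(v1.handle("user:1"))  # ('Ada', 'cache miss')
print(v1.handle("user:1"))  # ('Ada', 'cache hit')
```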
Failure Scenario
Component Fails: Traffic Router (Canary Controller)
Impact: Requests cannot be routed to either stable or canary services, causing service unavailability.
Mitigation: Use redundant Traffic Router instances with health checks and automatic failover to ensure routing continues without interruption.
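
One way to picture the mitigation is a caller that skips router instances failing their health check. This is a simplified sketch: `RouterInstance`, `route_with_failover`, and the instance names are hypothetical, and a real setup would typically rely on the load balancer's own health checking.

```python
class RouterInstance:
    """Stand-in for one Traffic Router replica."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def health_check(self):
        return self.healthy

    def route(self, request):
        return f"{self.name} routed {request}"

def route_with_failover(routers, request):
    """Send the request through the first router that passes its health check."""
    for router in routers:
        if router.health_check():
            return router.route(request)
    raise RuntimeError("no healthy traffic router available")

routers = [RouterInstance("router-a"), RouterInstance("router-b")]
print(route_with_failover(routers, "GET /orders"))  # router-a routed GET /orders

routers[0].healthy = False  # simulate the failure scenario
print(route_with_failover(routers, "GET /orders"))  # router-b routed GET /orders
```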
Architecture Quiz - 3 Questions
Test your understanding
Which component decides what percentage of user requests go to the new version?
A. Load Balancer
B. Traffic Router (Canary Controller)
C. Service v1
D. Cache
Design Principle
This architecture demonstrates safe software rollout by routing a small portion of traffic to a new version while keeping most users on the stable version. It uses a traffic router to control routing decisions and caching to reduce database load, ensuring minimal user impact during deployment.

Practice

1. What is the main purpose of a canary deployment in microservices?
easy
A. To permanently run two versions side by side
B. To roll out the new version to all users at once
C. To release a new version to a small group of users first to reduce risk
D. To test the new version only in a development environment

Solution

  1. Step 1: Understand the goal of canary deployment

    Canary deployment aims to reduce risk by releasing new software versions to a small subset of users first.
  2. Step 2: Compare options with this goal

    Option C, releasing a new version to a small group of users first to reduce risk, matches this goal exactly. The other options describe different deployment strategies.
  3. Final Answer:

    To release a new version to a small group of users first to reduce risk -> Option C
  4. Quick Check:

    Canary deployment = gradual rollout [OK]
Hint: Canary means small test group first, not all users [OK]
Common Mistakes:
  • Confusing canary with blue-green deployment
  • Thinking canary deploys to all users at once
  • Assuming canary is only for testing environments
2. Which of the following is the correct way to control traffic during a canary deployment?
easy
A. Send 100% of traffic to the new version immediately
B. Route a small percentage of traffic to the new version and the rest to the old
C. Stop all traffic during deployment
D. Send traffic randomly without control

Solution

  1. Step 1: Understand traffic control in canary deployment

    Traffic is gradually shifted to the new version to monitor its behavior safely.
  2. Step 2: Identify the correct traffic routing method

    Option B sends a small percentage of traffic to the new version while keeping most traffic on the old version, which is how a canary rollout controls exposure.
  3. Final Answer:

    Route a small percentage of traffic to the new version and the rest to the old -> Option B
  4. Quick Check:

    Traffic control = gradual routing [OK]
Hint: Gradually shift traffic, never 100% at once [OK]
Common Mistakes:
  • Sending all traffic immediately to new version
  • Stopping traffic completely during deployment
  • Ignoring traffic routing control
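
The gradual shift behind this answer can be sketched as a step schedule. The function name `next_canary_percent` and the steps 1, 5, 25, 50, 100 are illustrative assumptions, not values from this lesson.

```python
def next_canary_percent(current, healthy, steps=(1, 5, 25, 50, 100)):
    """Return the next canary traffic share: step up if healthy, else drop to 0."""
    if not healthy:
        return 0  # send all traffic back to the old version
    for step in steps:
        if step > current:
            return step
    return steps[-1]  # already at full rollout

percent = 0
history = []
for healthy in [True, True, True, True, True]:
    percent = next_canary_percent(percent, healthy)
    history.append(percent)
print(history)  # [1, 5, 25, 50, 100]
```

Each step would normally be held long enough to collect meaningful metrics before moving on.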
3. Consider this simplified code snippet for traffic routing in a canary deployment:
def route_request(user_id):
    if user_id % 10 == 0:
        return "new_version"
    else:
        return "old_version"

print(route_request(20))
print(route_request(23))
What will be the output?
medium
A. "new_version" followed by "old_version"
B. "new_version" followed by "new_version"
C. "old_version" followed by "old_version"
D. "old_version" followed by "new_version"

Solution

  1. Step 1: Evaluate route_request(20)

    20 % 10 equals 0, so it returns "new_version".
  2. Step 2: Evaluate route_request(23)

    23 % 10 equals 3, not 0, so it returns "old_version".
  3. Final Answer:

    "new_version" followed by "old_version" -> Option A
  4. Quick Check:

    Modulo 10 == 0 routes to new version [OK]
Hint: Check modulo condition carefully for routing [OK]
Common Mistakes:
  • Misunderstanding modulo operator
  • Assuming all users go to new version
  • Mixing output order
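
The snippet in this question picks user IDs divisible by 10, which ties the canary group to how IDs happen to be assigned. A possible variation, shown here as a sketch, hashes the ID into one of 100 buckets so the share is adjustable while each user still gets a stable answer. The `canary_percent` parameter and the choice of SHA-256 are assumptions for the example.

```python
import hashlib

def route_request(user_id, canary_percent=10):
    """Deterministically map a user to "new_version" or "old_version"."""
    # Hash the ID so the bucket does not depend on ID patterns.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "new_version" if bucket < canary_percent else "old_version"

# The same user always lands on the same version for a given percentage.
print(route_request(20) == route_request(20))  # True

# Roughly 10% of a large user population should reach the new version.
share = sum(route_request(uid) == "new_version" for uid in range(10_000)) / 10_000
print(share)
```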
4. A team implemented a canary deployment but noticed that 100% of users are routed to the new version immediately. What is the most likely cause?
medium
A. Traffic routing logic sends all traffic to new version without percentage control
B. Monitoring tools are not enabled
C. Rollback was triggered accidentally
D. Old version servers are down

Solution

  1. Step 1: Analyze the symptom

    All users routed to new version immediately means no gradual traffic control.
  2. Step 2: Identify the cause

    Option A fits the symptom: if the routing logic has no percentage control, every request goes to the new version and the shift is immediate and complete.
  3. Final Answer:

    Traffic routing logic sends all traffic to new version without percentage control -> Option A
  4. Quick Check:

    Immediate full traffic = missing gradual routing [OK]
Hint: Check traffic routing code for percentage control [OK]
Common Mistakes:
  • Blaming monitoring tools for routing issues
  • Assuming rollback causes full traffic shift
  • Ignoring server status impact
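
A quick way to catch this kind of bug is to measure the share of users a routing function actually sends to the new version. The sketch below is illustrative: `observed_canary_share`, `broken_route`, and `fixed_route` are hypothetical names, and the fixed version reuses the modulo rule from question 3.

```python
def observed_canary_share(route, user_ids):
    """Fraction of the given users that `route` sends to the new version."""
    routed = [route(uid) for uid in user_ids]
    return routed.count("new_version") / len(routed)

def broken_route(user_id):
    # Bug: no percentage control, so every user gets the new version.
    return "new_version"

def fixed_route(user_id):
    # Same rule as the question 3 snippet: one user in ten.
    return "new_version" if user_id % 10 == 0 else "old_version"

users = range(1000)
print(observed_canary_share(broken_route, users))  # 1.0
print(observed_canary_share(fixed_route, users))   # 0.1
```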
5. You want to design a canary deployment system that automatically rolls back if error rates exceed 5% during rollout. Which combination of components is essential?
hard
A. Load balancer, static routing, manual rollback process
B. Manual deployment script, user feedback form, database backup
C. Continuous integration server, code linter, version control
D. Traffic router, monitoring system, automated rollback controller

Solution

  1. Step 1: Identify components for traffic control and monitoring

    A traffic router directs user requests between the old and new versions; a monitoring system tracks error rates.
  2. Step 2: Include automated rollback for quick response

    An automated rollback controller triggers rollback if error thresholds are exceeded.
  3. Final Answer:

    Traffic router, monitoring system, automated rollback controller -> Option D
  4. Quick Check:

    Canary needs routing + monitoring + rollback [OK]
Hint: Combine routing, monitoring, and rollback for safe canary [OK]
Common Mistakes:
  • Ignoring automation in rollback
  • Confusing deployment tools with monitoring
  • Missing traffic routing component
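
How the three components in Option D fit together can be sketched as follows. The class names, the starting 10% canary share, and the `min_requests` guard are assumptions for the example; only the 5% threshold comes from the question.

```python
class CanaryMonitor:
    """Counts canary requests and errors (stand-in for a monitoring system)."""

    def __init__(self):
        self.requests = 0
        self.errors = 0

    def record(self, ok):
        self.requests += 1
        if not ok:
            self.errors += 1

    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0

class RollbackController:
    """Sets the router's canary share to 0 when the error rate is too high."""

    def __init__(self, monitor, threshold=0.05, min_requests=100):
        self.monitor = monitor
        self.threshold = threshold        # 5% from the question
        self.min_requests = min_requests  # assumed guard against tiny samples
        self.canary_percent = 10          # assumed share the router applies

    def evaluate(self):
        if self.monitor.requests < self.min_requests:
            return "waiting"
        if self.monitor.error_rate() > self.threshold:
            self.canary_percent = 0  # router sends everything to the old version
            return "rolled_back"
        return "healthy"

monitor = CanaryMonitor()
controller = RollbackController(monitor)
for i in range(100):
    monitor.record(ok=(i >= 6))  # 6 failures out of 100 requests = 6%
print(controller.evaluate(), controller.canary_percent)  # rolled_back 0
```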