Parallel running in Microservices - Deep Dive

Overview - Parallel running
What is it?
Parallel running is a method where two versions of a system run side by side at the same time. This allows teams to compare the old system with the new one by processing the same inputs in both. It helps ensure the new system works correctly before fully switching over. This approach reduces risks during upgrades or migrations.
Why it matters
Without parallel running, switching to a new system can cause unexpected failures or data loss, impacting users and business operations. Parallel running provides a safety net by letting teams detect issues early while still using the trusted old system. This reduces downtime and builds confidence in the new system's reliability.
Where it fits
Before learning parallel running, you should understand basic system deployment and testing strategies. After mastering it, you can explore advanced deployment techniques like blue-green deployment and canary releases. Parallel running fits into the broader topic of system migration and release management.
Mental Model
Core Idea
Parallel running means running old and new systems side by side to compare results and ensure smooth transition.
Think of it like...
It's like driving two cars on parallel lanes to see if the new car performs as well as the old one before selling the old car.
┌───────────────┐       ┌───────────────┐
│   Old System  │       │   New System  │
└──────┬────────┘       └──────┬────────┘
       │                       │
       │ Same Input Data       │
       ├───────────────────────┤
       │                       │
       ▼                       ▼
┌───────────────┐       ┌───────────────┐
│ Old Output    │       │ New Output    │
└───────────────┘       └───────────────┘
       │                       │
       └─────────Compare───────┘
Build-Up - 6 Steps
1
Foundation: Understanding system migration basics
Concept: Introduce the idea of moving from an old system to a new one and the challenges involved.
When a company updates its software, it needs to move data and users from the old system to the new one. This process is called migration. Challenges include data loss, downtime, and unexpected bugs. Simple migration without checks can cause failures.
Result
Learners understand why migrating systems is tricky and why careful planning is needed.
Knowing the risks of migration sets the stage for why safer methods like parallel running are necessary.
2
Foundation: Basics of running two systems simultaneously
Concept: Explain what it means to run two systems at the same time and why it helps.
Running two systems simultaneously means both receive the same inputs and produce outputs independently. This allows teams to compare results and find differences. It helps catch errors in the new system before fully switching.
Result
Learners grasp the core idea of parallel running as a safety check.
Understanding simultaneous operation is key to seeing how parallel running reduces risk.
3
Intermediate: Implementing parallel running in microservices
Before reading on: Do you think parallel running requires duplicating all services or only critical ones? Commit to your answer.
Concept: Show how to apply parallel running in a microservices architecture by duplicating services or routes.
In microservices, parallel running can mean deploying both old and new versions of a service. Each incoming request is sent to both versions. The old version's response is returned to the caller, while the new version's response is only recorded and compared against it. Not all services need duplication; focus on critical or changed ones to save resources.
Result
Learners see practical ways to run parallel systems in microservices.
Knowing selective duplication balances safety and resource use in real systems.
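The request flow described in this step can be sketched in a few lines of Python. This is a minimal illustration, not a production router: `MirroringRouter`, `old_price`, and `new_price` are hypothetical names, and the handlers are plain functions standing in for calls to the two service versions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Handler = Callable[[Dict], Dict]  # hypothetical shape: request dict -> response dict


@dataclass
class MirroringRouter:
    """Sends each request to both versions; only the old response is returned."""
    old_handler: Handler
    new_handler: Handler
    mismatches: List[Dict] = field(default_factory=list)

    def handle(self, request: Dict) -> Dict:
        old_response = self.old_handler(request)
        try:
            new_response = self.new_handler(request)
        except Exception as exc:
            # A failure in the new version must never reach the caller.
            new_response = {"error": repr(exc)}
        if new_response != old_response:
            self.mismatches.append(
                {"request": request, "old": old_response, "new": new_response}
            )
        return old_response


def old_price(request: Dict) -> Dict:
    return {"total": request["quantity"] * 10}


def new_price(request: Dict) -> Dict:
    # Deliberate difference above 5 items, so the example records a mismatch.
    discount = 5 if request["quantity"] > 5 else 0
    return {"total": request["quantity"] * 10 - discount}


router = MirroringRouter(old_price, new_price)
small = router.handle({"quantity": 2})
large = router.handle({"quantity": 6})
```

Both calls return the old version's answer. Only the second request ends up in `router.mismatches`, because that is the only one where the two versions disagree. A real router would usually call the new version asynchronously so that it cannot slow down the caller.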
4
Intermediate: Handling data consistency during parallel running
Before reading on: Should the new system write data immediately or wait until fully verified? Commit to your answer.
Concept: Discuss strategies to keep data consistent between old and new systems during parallel running.
Data consistency is crucial. One approach is to write data to both systems simultaneously but only use the old system's data until the new one is verified. Another is to write only to the old system and replay data to the new one for testing. Each has tradeoffs in complexity and risk.
Result
Learners understand how to manage data safely during parallel running.
Understanding data strategies prevents corruption and ensures smooth transition.
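As a rough sketch of the first strategy (write to both, trust only the old one), the example below uses two in-memory dictionaries in place of real databases. `DualWriteStore` and its methods are hypothetical names chosen for illustration, and the `new_store_available` flag simply simulates a failed write to the new system.

```python
from typing import Dict, List, Optional, Tuple


class DualWriteStore:
    """Writes to both stores but reads only from the old one."""

    def __init__(self) -> None:
        self.old_store: Dict[str, str] = {}
        self.new_store: Dict[str, str] = {}
        self.failed_new_writes: List[Tuple[str, str]] = []

    def write(self, key: str, value: str, new_store_available: bool = True) -> None:
        self.old_store[key] = value  # the old system stays the source of truth
        if new_store_available:
            self.new_store[key] = value
        else:
            # Remember the write so it can be replayed into the new store later.
            self.failed_new_writes.append((key, value))

    def read(self, key: str) -> Optional[str]:
        # The new store is never read while it is still being verified.
        return self.old_store.get(key)

    def replay_failed_writes(self) -> int:
        replayed = len(self.failed_new_writes)
        for key, value in self.failed_new_writes:
            self.new_store[key] = value
        self.failed_new_writes.clear()
        return replayed

    def diverged_keys(self) -> List[str]:
        return sorted(
            key for key, value in self.old_store.items()
            if self.new_store.get(key) != value
        )


store = DualWriteStore()
store.write("order-1", "paid")
store.write("order-2", "pending", new_store_available=False)
before_replay = store.diverged_keys()
replayed_count = store.replay_failed_writes()
after_replay = store.diverged_keys()
```

The replay step is also a small version of the second strategy, where writes go to the old system first and are fed to the new one afterwards. This sketch ignores write ordering: if a key is written again before the replay runs, the replay could overwrite the newer value, so a real implementation would need versioning or ordered logs.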
5
Advanced: Monitoring and comparing outputs effectively
Before reading on: Do you think manual comparison is enough, or are automated tools needed? Commit to your answer.
Concept: Explain how to monitor and compare outputs from both systems to detect differences automatically.
Manual comparison is slow and error-prone. Automated tools can log outputs, compare them, and alert teams on mismatches. Metrics and dashboards help track system health. This automation is essential for large-scale systems.
Result
Learners see how automation improves reliability and speed in parallel running.
Knowing the importance of automation helps scale parallel running safely.
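One possible shape for such automation is a comparator that counts mismatches and alerts once the mismatch rate crosses a threshold. The class name, the threshold, and the minimum sample size below are all assumptions made for illustration; the `alert` callback stands in for whatever paging or chat integration a team actually uses.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class OutputComparator:
    """Counts mismatches and raises one alert when the rate crosses a threshold."""
    alert: Callable[[str], None]
    max_mismatch_rate: float = 0.01  # assumed threshold
    min_samples: int = 100           # assumed: do not alert on tiny samples
    total: int = 0
    mismatched: int = 0
    alerting: bool = False

    def mismatch_rate(self) -> float:
        return self.mismatched / self.total if self.total else 0.0

    def record(self, request_id: str, old_output: Any, new_output: Any) -> bool:
        self.total += 1
        matched = old_output == new_output
        if not matched:
            self.mismatched += 1
        over = (self.total >= self.min_samples
                and self.mismatch_rate() > self.max_mismatch_rate)
        if over and not self.alerting:
            # Alert only on the transition, not on every request afterwards.
            self.alert(f"mismatch rate {self.mismatch_rate():.1%} "
                       f"after {self.total} requests (last: {request_id})")
        self.alerting = over
        return matched


alerts: List[str] = []
comparator = OutputComparator(alert=alerts.append, max_mismatch_rate=0.2, min_samples=5)
for i in range(10):
    new_output = "B" if i % 3 == 0 else "A"  # every third response differs
    comparator.record(f"req-{i}", "A", new_output)
```

Alerting on a rate rather than on each individual mismatch keeps the signal usable at high traffic volumes, while the counters can still feed a dashboard.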
6
Expert: Challenges and surprises in production parallel running
Before reading on: Do you think parallel running eliminates all risks? Commit to your answer.
Concept: Reveal hidden challenges like timing differences, side effects, and resource overhead in real-world parallel running.
Even with parallel running, subtle issues arise. Timing differences can cause outputs to differ even if logic is correct. Side effects like sending emails or payments must be controlled to avoid duplication. Running two systems doubles resource use, impacting cost and performance.
Result
Learners appreciate the complexity and tradeoffs in production environments.
Understanding these challenges prepares teams to design safer, more efficient parallel running setups.
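Controlling side effects often comes down to giving the new system a stand-in for the component that talks to the outside world. The sketch below assumes the email sender is passed in as a dependency; `RealEmailSender`, `RecordingEmailSender`, and `confirm_order` are hypothetical names, and the "real" sender only appends to a list in place of calling a mail gateway.

```python
from typing import List, Protocol


class EmailSender(Protocol):
    def send(self, to: str, subject: str) -> None: ...


class RealEmailSender:
    """Used by the old system. A list stands in for a real mail gateway."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, to: str, subject: str) -> None:
        self.sent.append(f"{to}: {subject}")


class RecordingEmailSender:
    """Used by the new system: records what it would have sent, sends nothing."""

    def __init__(self) -> None:
        self.suppressed: List[str] = []

    def send(self, to: str, subject: str) -> None:
        self.suppressed.append(f"{to}: {subject}")


def confirm_order(order_id: str, customer: str, sender: EmailSender) -> str:
    sender.send(customer, f"Order {order_id} confirmed")
    return "confirmed"


live = RealEmailSender()
shadow = RecordingEmailSender()
old_result = confirm_order("42", "a@example.com", live)
new_result = confirm_order("42", "a@example.com", shadow)
```

The customer receives one email, yet the suppressed message is still available for comparison, so the side effect itself can be checked like any other output.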
Under the Hood
Parallel running works by duplicating input streams to two systems and capturing their outputs independently. Internally, this requires routing layers or proxies that send identical requests to both systems. Outputs are logged and compared by monitoring tools. Data synchronization mechanisms ensure both systems have consistent state or handle eventual consistency. Side effects are isolated or controlled to prevent duplication.
Why designed this way?
Parallel running was designed to reduce risk during system upgrades by providing a live comparison between old and new systems. Alternatives like big-bang cutovers risk total failure, while parallel running allows gradual validation. The design balances safety with operational complexity and resource cost.
┌───────────────┐
│  User Input   │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│  Router/Proxy │
└──────┬────────┘
       │
 ┌─────┴─────┐
 │           │
 ▼           ▼
┌───────┐ ┌───────┐
│ Old   │ │ New   │
│System │ │System │
└──┬────┘ └──┬────┘
   │         │
   ▼         ▼
┌───────┐ ┌───────┐
│Output │ │Output │
│Logs   │ │Logs   │
└──┬────┘ └──┬────┘
   │         │
   └───Compare─────┐
                   ▼
             ┌───────────┐
             │ Monitoring│
             │ & Alerts  │
             └───────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does parallel running guarantee zero downtime? Commit to yes or no.
Common Belief: Parallel running always eliminates downtime during system upgrades.
Reality: While parallel running reduces risk, it does not guarantee zero downtime because issues like data sync delays or resource limits can still cause interruptions.
Why it matters: Believing in zero downtime can lead to under-preparedness and unexpected outages during migration.
Quick: Is it safe to let both systems send emails or payments during parallel running? Commit to yes or no.
Common Belief: Both old and new systems can perform all side effects simultaneously without issues.
Reality: Allowing both systems to perform side effects like sending emails or payments can cause duplicates and confusion. Side effects must be controlled or disabled in the new system during testing.
Why it matters: Ignoring this leads to duplicated actions, harming user trust and business operations.
Quick: Does parallel running mean you must duplicate every microservice? Commit to yes or no.
Common Belief: You must run every service in parallel to do parallel running.
Reality: Only critical or changed services need parallel running to save resources and complexity. Others can remain on the old system until fully migrated.
Why it matters: Trying to duplicate everything wastes resources and complicates deployment unnecessarily.
Quick: Does output difference always mean the new system is wrong? Commit to yes or no.
Common Belief: Any difference in outputs between old and new systems means the new system has bugs.
Reality: Differences can arise from timing, non-deterministic processes, or expected improvements. Not all differences indicate errors.
Why it matters: Misinterpreting differences can cause wasted debugging effort and delay deployment.
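The last myth suggests that a comparison should tolerate differences that are expected. A minimal sketch, assuming responses are flat dictionaries: fields known to vary between runs are skipped and numbers are compared within a tolerance. The field names in `VOLATILE_FIELDS` and the tolerance value are assumptions for illustration, not a recommendation.

```python
import math
from typing import Any, Dict, Iterable, List

VOLATILE_FIELDS = ("timestamp", "request_id")  # assumed field names
_MISSING = object()                            # marks a key absent from one side


def diff_responses(
    old: Dict[str, Any],
    new: Dict[str, Any],
    ignore: Iterable[str] = VOLATILE_FIELDS,
    abs_tol: float = 0.01,
) -> List[str]:
    """Returns the keys whose values differ in a way that is not expected."""
    ignored = set(ignore)
    differing = []
    for key in sorted(set(old) | set(new)):
        if key in ignored:
            continue
        a, b = old.get(key, _MISSING), new.get(key, _MISSING)
        both_numbers = isinstance(a, (int, float)) and isinstance(b, (int, float))
        if both_numbers and math.isclose(a, b, abs_tol=abs_tol):
            continue  # numerically close enough to count as equal
        if a != b:
            differing.append(key)
    return differing


old = {"total": 19.99, "status": "ok", "timestamp": "2024-01-01T10:00:00Z"}
new = {"total": 19.994, "status": "ok", "timestamp": "2024-01-01T10:00:01Z"}
harmless = diff_responses(old, new)
real = diff_responses(old, {**new, "status": "failed"})
```

Here the timestamp and the tiny rounding difference are ignored, while the changed `status` is still reported. Intentional improvements in the new system need a separate allow-list, since no generic rule can tell them apart from bugs.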
Expert Zone
1
Timing differences between systems can cause output mismatches even if logic is correct, requiring tolerant comparison methods.
2
Side effects must be carefully isolated or mocked in the new system to avoid duplication during parallel running.
3
Resource overhead from running two systems can impact performance and cost, so selective parallel running is often used.
When NOT to use
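Parallel running is not ideal when resource constraints are tight or when side effects cannot be safely isolated. Alternatives such as canary releases, which send a small share of live traffic to the new version instead of duplicating every request, or blue-green deployment, which switches live traffic between two environments without comparing outputs, may fit better.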
Production Patterns
In production, teams often apply parallel running only to critical services or features. They use automated monitoring to compare outputs and, once the outputs agree, gradually shift live traffic to the new system. Side effects are disabled or routed carefully. Parallel running is usually combined with feature flags and rollback plans.
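One way to keep the overhead bounded while ramping up is to mirror only a sample of requests and raise the sample size over time. The sketch below picks the sample deterministically by hashing a request ID, so the same request always gets the same decision. The function name `should_mirror` and the idea of a percentage setting are assumptions for illustration.

```python
import hashlib


def should_mirror(request_id: str, percent: int) -> bool:
    """Returns True for roughly `percent` out of every 100 request IDs."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < percent


request_ids = [f"req-{i}" for i in range(10_000)]
mirrored_at_10 = [rid for rid in request_ids if should_mirror(rid, 10)]
mirrored_at_50 = {rid for rid in request_ids if should_mirror(rid, 50)}
```

Because each ID maps to a fixed bucket, every request mirrored at 10% is still mirrored at 50%, which keeps comparisons stable as the percentage grows.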
Connections
Blue-Green Deployment
Alternative deployment strategy
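Understanding parallel running clarifies how blue-green deployment differs: both environments are deployed, but live traffic goes to only one at a time and is switched over in a single step, whereas parallel running feeds the same input to both systems and compares the results.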
Canary Releases
Builds on gradual rollout concepts
Parallel running helps validate new versions before canary releases gradually expose users to changes.
Scientific Experimentation
Shares the pattern of control and test groups
Parallel running mirrors running control and test groups in experiments to compare outcomes before full adoption.
Common Pitfalls
#1 Allowing both systems to perform side effects like sending emails or processing payments.
Wrong approach: Both systems send confirmation emails to users simultaneously during parallel running.
Correct approach: Disable email sending in the new system or route side effects only through the old system during parallel running.
Root cause: Misunderstanding that side effects can cause duplication and user confusion if not controlled.
#2 Duplicating all microservices regardless of importance or change scope.
Wrong approach: Deploy every microservice twice for parallel running, even unchanged ones.
Correct approach: Only duplicate critical or updated microservices to optimize resources and reduce complexity.
Root cause: Assuming parallel running requires full system duplication without considering cost and complexity.
#3 Manually comparing outputs from old and new systems without automation.
Wrong approach: Team members read logs line by line to find differences after parallel running.
Correct approach: Use automated tools to log, compare, and alert on output differences efficiently.
Root cause: Underestimating the scale and speed needed for reliable output comparison.
Key Takeaways
Parallel running runs old and new systems side by side to safely test new versions before full migration.
It reduces risk by allowing comparison of outputs and catching errors early without disrupting users.
Managing data consistency and side effects carefully is critical to avoid corruption and duplication.
Automation in monitoring and comparison is essential for scaling parallel running in production.
Parallel running has tradeoffs in resource use and complexity, so selective application is common.

Practice

1. What is the main purpose of parallel running in microservices?
easy
A. To run old and new systems together to ensure smooth transition
B. To replace the old system immediately without testing
C. To run only the new system and discard the old one
D. To run multiple unrelated services in parallel

Solution

  1. Step 1: Understand the concept of parallel running

    Parallel running means running old and new systems side by side to compare their outputs and ensure the new system works correctly.
  2. Step 2: Identify the purpose in microservices

    This approach helps catch errors and ensures a smooth transition before fully switching to the new system.
  3. Final Answer:

    To run old and new systems together to ensure smooth transition -> Option A
  4. Quick Check:

    Parallel running = run old and new systems together [OK]
Hint: Parallel running means running old and new systems side by side [OK]
Common Mistakes:
  • Thinking parallel running means immediate replacement
  • Confusing parallel running with running unrelated services
  • Assuming old system is discarded immediately
2. Which of the following is the correct way to implement parallel running in a microservices upgrade?
easy
A. Deploy new microservice version alongside old one and route a copy of requests to both
B. Stop old microservice and deploy new one immediately
C. Deploy new microservice and ignore old service logs
D. Run new microservice only during off-peak hours

Solution

  1. Step 1: Understand deployment in parallel running

    Parallel running requires both old and new versions to run simultaneously to compare results.
  2. Step 2: Identify correct routing method

    Routing a copy of requests to both versions allows output comparison without disrupting users.
  3. Final Answer:

    Deploy new microservice version alongside old one and route a copy of requests to both -> Option A
  4. Quick Check:

    Parallel running = deploy both and route requests to both [OK]
Hint: Route requests to both old and new services in parallel [OK]
Common Mistakes:
  • Stopping old service before testing new one
  • Ignoring logs from old service
  • Running new service only at specific times
3. Consider a microservice system where requests are sent to both old and new versions during parallel running. If the old service returns response A and the new service returns response B, what should the system do?
medium
A. Ignore the difference and continue using the new service
B. Switch back to the old service permanently
C. Stop the old service immediately
D. Log the difference and alert engineers for investigation

Solution

  1. Step 1: Understand output comparison in parallel running

    Parallel running compares outputs to detect discrepancies between old and new services.
  2. Step 2: Decide action on output mismatch

    If outputs differ, the system should log the difference and alert engineers to investigate before switching fully.
  3. Final Answer:

    Log the difference and alert engineers for investigation -> Option D
  4. Quick Check:

    Output mismatch = log and alert [OK]
Hint: Log and alert on output differences during parallel running [OK]
Common Mistakes:
  • Ignoring output differences
  • Stopping old service too early
  • Switching back permanently without investigation
4. A team implemented parallel running but noticed that the new service never receives any requests. What is the most likely cause?
medium
A. The new service crashed immediately after deployment
B. The routing logic is only sending requests to the old service
C. The old service is not logging requests
D. The new service is slower than the old one

Solution

  1. Step 1: Analyze routing in parallel running

    For parallel running, requests must be routed to both old and new services simultaneously.
  2. Step 2: Identify why new service gets no requests

    If new service never receives requests, routing likely sends all traffic only to old service.
  3. Final Answer:

    The routing logic is only sending requests to the old service -> Option B
  4. Quick Check:

    No requests to new service = routing issue [OK]
Hint: Check routing logic if new service gets no requests [OK]
Common Mistakes:
  • Assuming new service crashed without checking logs
  • Blaming old service logs
  • Thinking speed affects request routing
5. You are designing a parallel running strategy for a microservices system with high traffic. Which approach best balances safety and performance?
hard
A. Route 100% of traffic to new service and keep old service idle
B. Run new service only during low traffic hours without output comparison
C. Mirror 10% of traffic to the new service while the old service still serves every request, compare outputs, then gradually increase the mirrored share
D. Stop old service immediately and monitor new service logs

Solution

  1. Step 1: Understand gradual ramp-up in parallel running

    Mirroring a small share of requests to the new service while comparing outputs limits both the risk and the extra load.
  2. Step 2: Evaluate options for safety and performance

    Starting with a small mirrored share and increasing it after validation balances safety and system load.
  3. Final Answer:

    Mirror 10% of traffic to the new service while the old service still serves every request, compare outputs, then gradually increase the mirrored share -> Option C
  4. Quick Check:

    Gradual ramp-up with output comparison = safe and performant [OK]
Hint: Start with a small mirrored share, compare, then increase [OK]
Common Mistakes:
  • Switching 100% traffic immediately
  • Skipping output comparison
  • Stopping old service too early