
Blue-green deployment for models in MLOps - Commands & Configuration

Introduction

Blue-green deployment helps you update machine learning models without stopping the service. It keeps two versions of the model running so users always get a working model while you test the new one.

When you want to update a model without causing downtime for users.

When you need to test a new model version with real traffic before fully switching.

When you want to quickly roll back to the old model if the new one has problems.

When you deploy models in production and want to avoid service interruptions.

When you want to compare performance between two model versions in real time.

Commands

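Start serving the current stable model version (blue) on port 1234 so users can access it. The URI below assumes the registered model my-model has an alias named blue. Registry URIs take the form models:/<name>/<version> or models:/<name>@<alias>; MLflow reads a non-numeric suffix after the slash as a stage name, and blue is not a registry stage.

Terminal

mlflow models serve -m models:/my-model@blue -p 1234

Expected Output

The exact log lines depend on the MLflow version and the web server it launches. Look for a message saying the server is listening on port 1234.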

-m - Specify the model URI to serve

-p - Set the port number for the server

Start serving the new model version (green) on a different port 1235 to test it without affecting users.

Terminal

mlflow models serve -m models:/my-model@green -p 1235

Expected Output

The exact log lines depend on the MLflow version and the web server it launches. Look for a message saying the server is listening on port 1235. The URI assumes the new version carries the registry alias green.

-m - Specify the model URI to serve

-p - Set the port number for the server
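Before sending test traffic, it can help to confirm that both servers are actually up. The sketch below polls a health endpoint until it answers. It assumes the MLflow scoring server exposes /ping on the serving port; the names http_ping and wait_until_ready and the probe parameter are hypothetical and exist only for this illustration.

```python
import http.client
import time
import urllib.request


def http_ping(base_url):
    """Return True if GET <base_url>/ping answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=2) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a malformed reply.
        return False


def wait_until_ready(base_url, probe=http_ping, attempts=30, delay=1.0):
    """Call the probe up to `attempts` times; True as soon as it succeeds."""
    for _ in range(attempts):
        if probe(base_url):
            return True
        time.sleep(delay)
    return False
```

Calling wait_until_ready("http://localhost:1234") and wait_until_ready("http://localhost:1235") gives a yes/no answer for blue and green. The probe is a parameter so the waiting logic can be exercised without a running server.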

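Send a test prediction request to the green model to verify it works correctly before switching traffic. MLflow 2.x scoring servers expect the payload under one of the keys inputs, instances, dataframe_split, or dataframe_records; a bare data key is the older MLflow 1.x format.

Terminal

curl -X POST http://localhost:1235/invocations -H 'Content-Type: application/json' -d '{"inputs": [[5.1, 3.5, 1.4, 0.2]]}'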

Expected Output

{"predictions": [0]}
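A single successful request shows that green responds, not that it behaves like blue. One way to compare the two versions on the same inputs is sketched below. The function names and the post parameter are assumptions made for illustration, and the request body uses the inputs key on the assumption that both servers run MLflow 2.x.

```python
import json
import urllib.request


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def agreement_rate(blue_preds, green_preds):
    """Fraction of positions where both models returned the same prediction."""
    if len(blue_preds) != len(green_preds):
        raise ValueError("prediction lists must have the same length")
    if not blue_preds:
        return 1.0
    same = sum(b == g for b, g in zip(blue_preds, green_preds))
    return same / len(blue_preds)


def compare_models(rows, blue_url, green_url, post=post_json):
    """Send the same rows to both servers and return their agreement rate."""
    payload = {"inputs": rows}
    blue = post(f"{blue_url}/invocations", payload)["predictions"]
    green = post(f"{green_url}/invocations", payload)["predictions"]
    return agreement_rate(blue, green)
```

A new model is expected to disagree with the old one on some inputs, so a low agreement rate is a prompt to look closer rather than an automatic failure.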

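Mark the green version as Production in the model registry. The mlflow models command group covers serving and packaging and has no stage transition subcommand, so this step goes through the Python client. The version value 2 is a placeholder for green's numeric registry version. Stage transitions are deprecated in recent MLflow releases in favor of aliases. A server that is already running keeps the model it loaded at startup, so users reach the new model only once the router or load balancer in front of the servers points at port 1235.

Terminal

python -c "from mlflow import MlflowClient; MlflowClient().transition_model_version_stage(name='my-model', version='2', stage='Production')"

Expected Output

The command prints no confirmation on success (recent MLflow versions may emit a deprecation warning) and raises an exception if the model or version does not exist.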
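Updating the registry records which version is production, but something in front of the two servers still has to decide where requests go. The sketch below models that switch in plain Python. BlueGreenRouter and its methods are hypothetical names; in practice this role is usually played by a load balancer or reverse proxy.

```python
class BlueGreenRouter:
    """Tracks which backend receives live traffic and allows one-step rollback."""

    def __init__(self, backends, active):
        if active not in backends:
            raise KeyError(f"unknown backend: {active}")
        self._backends = dict(backends)
        self._active = active
        self._previous = None

    @property
    def active_url(self):
        """URL that live requests should be forwarded to."""
        return self._backends[self._active]

    def switch_to(self, name):
        """Point live traffic at `name`, remembering the old target."""
        if name not in self._backends:
            raise KeyError(f"unknown backend: {name}")
        if name != self._active:
            self._previous, self._active = self._active, name

    def rollback(self):
        """Return traffic to the backend that was live before the last switch."""
        if self._previous is None:
            raise RuntimeError("nothing to roll back to")
        self._active, self._previous = self._previous, self._active


router = BlueGreenRouter(
    {"blue": "http://localhost:1234", "green": "http://localhost:1235"},
    active="blue",
)
router.switch_to("green")
print(router.active_url)  # http://localhost:1235
router.rollback()
print(router.active_url)  # http://localhost:1234
```

Rollback is cheap here only because blue is still running, which is why the old server is stopped last.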

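Stop serving the old blue model version after confirming the green version works well. mlflow models has no stop subcommand; the server runs until its process exits. Press Ctrl+C in the terminal where blue is running, or terminate the process that listens on port 1234. The lsof lookup below is one way to find that process on Linux and macOS.

Terminal

kill $(lsof -t -iTCP:1234 -sTCP:LISTEN)

Expected Output

kill prints nothing on success. New connections to http://localhost:1234 are then refused.

-t - Make lsof print only process IDs

-iTCP:1234 -sTCP:LISTEN - Select processes listening on TCP port 1234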
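If the servers are launched from a script rather than by hand, the blue process can be stopped through its process handle. The sketch below uses only the standard library. stop_server and the grace period are assumptions, and a sleeping Python child process stands in for the model server so the example runs anywhere.

```python
import subprocess
import sys


def stop_server(proc, grace_seconds=10.0):
    """Ask the process to exit, then force-kill it if it ignores the request."""
    if proc.poll() is not None:
        return proc.returncode  # already stopped
    proc.terminate()
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()


# Stand-in for a server started with subprocess.Popen(["mlflow", "models", "serve", ...]).
blue = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
stop_server(blue, grace_seconds=5)
print("blue running:", blue.poll() is None)  # blue running: False
```

Depending on how the real server spawns worker processes, the whole process group may need to be signalled rather than just the parent.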

Key Concept

If you remember nothing else from this pattern, remember: run two model versions side-by-side, test the new one, then switch traffic smoothly without downtime.

Code Example

MLOps

import requests
from mlflow import MlflowClient

# The blue and green servers are assumed to be running already (see Commands).

# Test the green model. MLflow 2.x expects the payload under "inputs",
# "instances", "dataframe_split", or "dataframe_records".
input_data = {"inputs": [[5.1, 3.5, 1.4, 0.2]]}
response = requests.post("http://localhost:1235/invocations", json=input_data, timeout=10)
response.raise_for_status()  # stop here if green returned an error
print("Green model prediction response:", response.text)

# Promote green in the registry. Assumes the new version carries the alias
# "green"; stage transitions are deprecated in recent MLflow releases.
client = MlflowClient()
green_version = client.get_model_version_by_alias("my-model", "green").version
client.transition_model_version_stage(name="my-model", version=green_version, stage="Production")
print("Switched production to green model version.")


Common Mistakes

Stopping the old model before verifying the new model works.

This causes downtime if the new model has errors or is not ready.

Always run both models simultaneously and test the new one before switching.

Serving both models on the same port.

Only one service can listen on a port, so the second server will fail to start.

Use different ports for blue and green model servers during testing.

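Not switching production over to the new version.

Traffic keeps going to the old model until the registry's production stage names the new version and the router in front of the servers points at green. A server that is already running does not reload when the registry changes.

Update the production stage in the model registry and repoint traffic to the green server.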
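The same-port mistake above can be caught before the second server is launched. The sketch below tries to bind the port and reports whether that worked. port_is_free is a hypothetical helper, and the check is only a convenience: another process could still take the port between the check and the server start.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # something else already holds the port
        return True
```

Checking port_is_free(1235) before starting green turns a confusing startup failure into an explicit message.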

Summary

Start serving the current stable model (blue) on one port.

Serve the new model (green) on a different port to test it safely.

Send test requests to the green model to verify correctness.

Switch production traffic to the green model: update the model registry and point the router at the green server.

Stop the old blue model after confirming the new model works well.
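The steps above can be read as one small control flow with a rollback path. The sketch below passes each step in as a callable so the ordering is explicit. blue_green_rollout and its parameters are hypothetical names; the callables would wrap whatever verification, routing, and shutdown mechanisms a given setup uses.

```python
def blue_green_rollout(verify_green, switch_traffic, retire_blue,
                       monitor_green=lambda: True):
    """Run the rollout steps in order and report which version ends up live."""
    if not verify_green():
        return "blue"  # green failed its checks and never received live traffic
    switch_traffic("green")
    if not monitor_green():
        switch_traffic("blue")  # roll back; blue is still running
        return "blue"
    retire_blue()  # only after green has proven itself under live traffic
    return "green"
```

Blue is retired last, so every earlier exit leaves a working model in front of users.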

Practice


1. What is the main purpose of blue-green deployment in model updates?

easy

A. To run two models at the same time and combine their outputs

B. To switch traffic to a new model only after it is fully tested and ready

C. To update the model directly in the production environment without backup

D. To deploy models only during off-peak hours

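Solution

Step 1: Understand blue-green deployment concept. Two model versions run side by side: blue serves users while green is tested.

Step 2: Identify the key purpose. Traffic moves to the new model only after it has been verified, which avoids downtime and keeps rollback quick.

Final Answer: B. To switch traffic to a new model only after it is fully tested and ready

Quick Check: Test first, then switch traffic = B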
