Blue-green deployment for models in MLOps - Commands & Configuration

Introduction
Blue-green deployment helps you update machine learning models without stopping the service. It keeps two versions of the model running so users always get a working model while you test the new one.
Use it in the following situations:
When you want to update a model in production without causing downtime or service interruptions for users.
When you need to test a new model version with real traffic before fully switching.
When you want to quickly roll back to the old model if the new one has problems.
When you want to compare performance between two model versions in real time.
Commands
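Start serving the current stable model version (blue) on port 1234 so users can access it. The URI assumes that blue is a registry alias pointing at the stable version; in MLflow releases without alias support, use the version number instead, as in models:/my-model/<version>.
Terminal
mlflow models serve -m models:/my-model@blue -p 1234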
Expected Output (illustrative; the exact log lines depend on the MLflow version)
2024/06/01 12:00:00 INFO mlflow.models.cli: Starting MLflow model server for model 'my-model' version 'blue' on port 1234
2024/06/01 12:00:00 INFO mlflow.models.cli: Listening on http://0.0.0.0:1234
-m - Specify the model URI to serve
-p - Set the port number for the server
Start serving the new model version (green) on a different port, 1235, to test it without affecting users. As with blue, the URI assumes that green is a registry alias for the new version.
Terminal
mlflow models serve -m models:/my-model@green -p 1235
Expected Output (illustrative; the exact log lines depend on the MLflow version)
2024/06/01 12:01:00 INFO mlflow.models.cli: Starting MLflow model server for model 'my-model' version 'green' on port 1235
2024/06/01 12:01:00 INFO mlflow.models.cli: Listening on http://0.0.0.0:1235
-m - Specify the model URI to serve
-p - Set the port number for the server
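A model server can take a while to load its model, so it helps to wait until the green server answers before sending test traffic. The following is a minimal sketch, not part of MLflow: `ping` and `wait_until_ready` are hypothetical helper names, and the sketch assumes the server exposes a `/ping` health endpoint that returns HTTP 200 once it is ready.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def ping(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/ping answers with HTTP 200 (assumed health endpoint)."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: treat as "not ready".
        return False


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30, delay: float = 1.0) -> bool:
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # pause between attempts, but not after the last one
    return False
```

For the green server in this walkthrough, the call would look like `wait_until_ready(lambda: ping("http://localhost:1235"))`. Passing the probe as a callable keeps the retry loop independent of the network, which makes it easy to test.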
Send a test prediction request to the green model to verify it works correctly before switching traffic. MLflow 2.x scoring servers expect the rows under a key such as inputs or dataframe_split rather than a bare data key.
Terminal
curl -X POST http://localhost:1235/invocations -H 'Content-Type: application/json' -d '{"inputs": [[5.1, 3.5, 1.4, 0.2]]}'
Expected Output
{"predictions": [0]}
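A single request shows that green responds, but it does not show how green compares with blue. One way to check this is to send the same rows to both servers and measure how often they agree. This is a sketch under assumptions: `fetch_predictions` and `agreement_rate` are hypothetical helpers, the response is assumed to have the `{"predictions": [...]}` shape shown above, and the payload is passed in unchanged so it can match whatever format your MLflow version accepts.

```python
import json
import urllib.request


def fetch_predictions(base_url: str, payload: dict, timeout: float = 10.0) -> list:
    """POST the payload to <base_url>/invocations and return the 'predictions' list."""
    request = urllib.request.Request(
        f"{base_url}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["predictions"]


def agreement_rate(blue_preds: list, green_preds: list) -> float:
    """Fraction of positions where both models returned the same prediction."""
    if len(blue_preds) != len(green_preds):
        raise ValueError("both models must be scored on the same rows")
    if not blue_preds:
        raise ValueError("need at least one prediction to compare")
    matches = sum(b == g for b, g in zip(blue_preds, green_preds))
    return matches / len(blue_preds)
```

A low agreement rate is not automatically a failure, since the new model may be better; it is a signal to inspect the disagreeing rows before switching traffic.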
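Promote the green model version to the Production stage in the model registry. The mlflow models CLI has no transition subcommand, so the promotion goes through the Python client; replace <green-version> with the registry version number of the green model. Changing the stage updates the registry only: a server that is already running keeps the model it loaded at startup, so users reach green only once your proxy or load balancer points at port 1235.
Terminal
python -c "from mlflow.tracking import MlflowClient; MlflowClient().transition_model_version_stage(name='my-model', version='<green-version>', stage='Production')"
Expected Output
No output on success (newer MLflow releases may print a deprecation warning about model stages).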
Stop serving the old blue model version after confirming the green version works well. MLflow has no stop subcommand; end the blue server by pressing Ctrl+C in its terminal, or terminate the process listening on port 1234. The command below assumes Linux or macOS with lsof installed.
Terminal
kill $(lsof -ti tcp:1234 -sTCP:LISTEN)
Expected Output
No output on success.
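Stopping the blue server comes down to terminating its process. If the server was started from Python rather than from a terminal, the shutdown can be scripted. The sketch below assumes the server was launched with `subprocess.Popen`; `stop_server` is a hypothetical helper, and the demo uses a sleeping stand-in process instead of a real model server. A real server may also spawn worker processes that need their own handling.

```python
import subprocess
import sys


def stop_server(proc: subprocess.Popen, grace_seconds: float = 10.0) -> int:
    """Ask the process to exit; force-kill it if it ignores the request."""
    if proc.poll() is None:  # still running
        proc.terminate()  # polite request (SIGTERM on POSIX)
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # last resort
            proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for the blue server: a process that would otherwise run for a minute.
    blue = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    stop_server(blue)
    print("blue exited:", blue.poll() is not None)
```

Terminating first and killing only after a grace period gives the server a chance to finish in-flight requests.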
Key Concept

If you remember nothing else from this pattern, remember: run two model versions side-by-side, test the new one, then switch traffic smoothly without downtime.
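The pattern itself is small enough to sketch in a few lines. The class below is a hypothetical illustration, not an MLflow API: it only tracks which of the two endpoints is live, the way a proxy or load balancer configuration would, and it remembers the previous one so a rollback is a single step.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BlueGreenRouter:
    """Tracks which of two model endpoints currently receives user traffic."""

    endpoints: Dict[str, str]
    live: str = "blue"
    previous: Optional[str] = None

    def live_url(self) -> str:
        return self.endpoints[self.live]

    def switch_to(self, color: str) -> None:
        """Point traffic at another endpoint, remembering the old one."""
        if color not in self.endpoints:
            raise KeyError(f"unknown endpoint: {color}")
        if color != self.live:
            self.previous, self.live = self.live, color

    def rollback(self) -> None:
        """Return traffic to the endpoint that was live before the last switch."""
        if self.previous is None:
            raise RuntimeError("nothing to roll back to")
        self.live, self.previous = self.previous, self.live


router = BlueGreenRouter({"blue": "http://localhost:1234", "green": "http://localhost:1235"})
router.switch_to("green")   # green has passed its tests
print(router.live_url())
router.rollback()           # green misbehaves: go back to blue
print(router.live_url())
```

Because both servers keep running during the switch, the rollback only changes a pointer; nothing has to be redeployed.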

Code Example
MLOps
import mlflow
import requests

# The blue and green servers are assumed to be running already
# (blue on port 1234, green on port 1235).

# Send a test prediction to the green model and fail loudly on an HTTP error.
# MLflow 2.x expects the rows under "inputs" (or "dataframe_split").
input_data = {"inputs": [[5.1, 3.5, 1.4, 0.2]]}
response = requests.post("http://localhost:1235/invocations", json=input_data, timeout=10)
response.raise_for_status()
print("Green model prediction response:", response.text)

# Promote the green version in the model registry.
# GREEN_VERSION must be the registry version number of the green model;
# the value here is a placeholder.
GREEN_VERSION = "1"
client = mlflow.tracking.MlflowClient()
client.transition_model_version_stage(name="my-model", version=GREEN_VERSION, stage="Production")
print("Switched production to green model version.")
Common Mistakes
Stopping the old model before verifying the new model works.
This causes downtime if the new model has errors or is not ready.
Always run both models simultaneously and test the new one before switching.
Serving both models on the same port.
Only one service can listen on a port, so the second server will fail to start.
Use different ports for blue and green model servers during testing.
Not switching the production stage in the model registry.
Traffic will continue going to the old model, so the new model won't be used.
Use the model registry to update the production stage to the new model version.
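The port clash described above can be caught before starting the second server. The following is a minimal sketch using only the standard library; `port_is_free` is a hypothetical helper. The check is advisory, since another process could still claim the port between the check and the server start.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
    return True


if __name__ == "__main__":
    for color, port in (("blue", 1234), ("green", 1235)):
        state = "free" if port_is_free(port) else "in use"
        print(f"{color} port {port}: {state}")
```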
Summary
Start serving the current stable model (blue) on one port.
Serve the new model (green) on a different port to test it safely.
Send test requests to the green model to verify correctness.
Switch production traffic to the green model using the model registry.
Stop the old blue model after confirming the new model works well.