
Rollback strategies for failed updates in MLOps - Commands & Configuration

Introduction
When you update a machine learning model or pipeline, the new version can introduce errors or degrade results. Rollback strategies let you quickly return to the last stable version so your system keeps working smoothly. Typical situations where a rollback is needed:
When a new model version causes prediction errors or crashes in production
When a pipeline update breaks data processing steps unexpectedly
When performance metrics drop after deploying a new model
When you want to test a new model but keep the option to revert easily
When you need to maintain service availability during model updates
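The scenarios above all reduce to one decision: keep the new version, or revert to the last stable one. A minimal sketch of that decision logic, assuming hypothetical model URIs and an accuracy threshold (none of these values come from a real deployment):

```python
# Hypothetical rollback decision: the URIs and threshold are illustrative.
STABLE_URI = "runs:/fedcba0987654321/model"     # last known-good version
CANDIDATE_URI = "runs:/1234567890abcdef/model"  # newly deployed version

def choose_model(candidate_accuracy, baseline_accuracy, max_drop=0.02):
    """Return the model URI to serve: keep the candidate unless its
    accuracy falls more than max_drop below the stable baseline."""
    if candidate_accuracy < baseline_accuracy - max_drop:
        return STABLE_URI   # metrics regressed: roll back
    return CANDIDATE_URI    # within tolerance: keep the new version

print(choose_model(0.88, 0.95))  # runs:/fedcba0987654321/model (rollback)
print(choose_model(0.94, 0.95))  # runs:/1234567890abcdef/model (keep)
```

In practice the two accuracy numbers would come from your monitoring system or MLflow run metrics; the point is that the rollback trigger is an explicit, testable rule rather than an ad-hoc judgment call.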
Commands
This command starts serving the current stable model version on port 1234 so your app can use it.
Terminal
mlflow models serve -m runs:/1234567890abcdef/model -p 1234
Expected Output
2024/06/01 12:00:00 INFO mlflow.models.cli: Starting MLflow model server for model 'runs:/1234567890abcdef/model' on port 1234
2024/06/01 12:00:00 INFO mlflow.models.cli: Listening on http://127.0.0.1:1234
-m - Specifies the model URI to serve
-p - Sets the port number for the server
If the new model version causes problems, this command rolls back by serving the previous stable model version on the same port.
Terminal
mlflow models serve -m runs:/fedcba0987654321/model -p 1234
Expected Output
2024/06/01 12:05:00 INFO mlflow.models.cli: Starting MLflow model server for model 'runs:/fedcba0987654321/model' on port 1234
2024/06/01 12:05:00 INFO mlflow.models.cli: Listening on http://127.0.0.1:1234
-m - Specifies the model URI to serve
-p - Sets the port number for the server
This command opens the MLflow tracking UI where you can compare model versions and decide which one to roll back to.
Terminal
mlflow ui
Expected Output
2024/06/01 12:00:00 INFO mlflow.server: Starting MLflow tracking UI at http://127.0.0.1:5000
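The comparison the UI supports visually can also be done in code. The sketch below uses made-up metric rows; in a real project you would fetch them with MLflow's `mlflow.search_runs()` and then build the serve command for the winning run:

```python
# Sketch of picking a rollback target by metric. In real MLflow these rows
# would come from mlflow.search_runs(); the values here are invented.
runs = [
    {"run_id": "1234567890abcdef", "accuracy": 0.88},  # failing new version
    {"run_id": "fedcba0987654321", "accuracy": 0.95},  # previous stable run
]

def best_run(runs, metric="accuracy"):
    """Return the run with the highest value of the given metric."""
    return max(runs, key=lambda r: r[metric])

target = best_run(runs)
print(f"mlflow models serve -m runs:/{target['run_id']}/model -p 1234")
```

Encoding the comparison this way lets you script the rollback instead of eyeballing the UI, which matters when you need to revert quickly.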
Key Concept

If a new model update fails, quickly switch back to the last stable model version to keep your system running smoothly.

Code Example
MLOps
import mlflow
from mlflow.models import infer_signature
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
import numpy as np

# Load data
iris = load_iris()
X, y = iris.data, iris.target

# Train a simple model
model = LogisticRegression(max_iter=200)  # enough iterations for the solver to converge on iris
model.fit(X, y)

# Log model with MLflow
with mlflow.start_run() as run:
    signature = infer_signature(X, model.predict(X))
    mlflow.sklearn.log_model(model, "model", signature=signature)
    print(f"Model logged in run {run.info.run_id}")

# To rollback, serve a previous run's model URI
# Example: mlflow models serve -m runs:/previous_run_id/model -p 1234
Common Mistakes
Not stopping the failed model server before starting the rollback version
The port remains in use, so the rollback server cannot start and serve requests
Stop the current model server process before running the rollback serve command
Deploying a new model without testing it in a staging environment first
You risk downtime or bad predictions in production without a safe rollback plan
Test new models in a separate environment and use MLflow UI to compare metrics before production deployment
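The first mistake above comes down to port binding: only one process can listen on a port at a time. This toy sketch (all names illustrative, no real servers involved) shows why the failed server must be stopped before the rollback version can take its place:

```python
# Toy model of port binding: one "server" per port, so the old one must be
# stopped before the rollback can bind. All URIs and ports are illustrative.
ports_in_use = {}

def serve(model_uri, port):
    """Simulate starting a model server; fails if the port is taken."""
    if port in ports_in_use:
        raise RuntimeError(f"port {port} already in use by {ports_in_use[port]}")
    ports_in_use[port] = model_uri

def stop(port):
    """Simulate stopping whatever server holds the port."""
    ports_in_use.pop(port, None)

serve("runs:/1234567890abcdef/model", 1234)  # deploy the new model
stop(1234)                                   # stop it before rolling back
serve("runs:/fedcba0987654321/model", 1234)  # rollback now binds cleanly
print(ports_in_use[1234])
```

With real `mlflow models serve` processes, "stop" means terminating the serving process (Ctrl+C or killing its PID) before launching the rollback command on the same port.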
Summary
Use 'mlflow models serve' to deploy a specific model version for predictions.
If the new model causes issues, serve the previous stable model version to roll back.
Use 'mlflow ui' to compare model versions and decide which to deploy or rollback.