
Trigger-based retraining (schedule, drift, performance) in MLOps - Commands & Configuration

Introduction
Machine learning models can lose accuracy over time as the data they see in production changes. Trigger-based retraining keeps models fresh by retraining them automatically when certain conditions are met, such as on a fixed schedule, when data drifts, or when performance drops.
When you want to retrain a model every week to keep it updated with new data.
When model accuracy drops below a set threshold and you want to retrain automatically.
When data changes significantly (data drift) and you want to trigger retraining to adapt.
When you want to automate retraining without manual checks to save time.
When you want to monitor model performance and retrain only when needed to save resources.
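A schedule trigger, the first of the three, can be as simple as comparing the time since the last retrain against a fixed interval. A minimal sketch, assuming a hypothetical weekly interval and a `schedule_trigger` helper (neither is part of any library):

```python
import time

RETRAIN_INTERVAL = 7 * 24 * 3600  # hypothetical weekly retraining interval, in seconds

def schedule_trigger(last_retrain_ts, now=None):
    """Return True when the retraining interval has elapsed since the last retrain."""
    now = time.time() if now is None else now
    return now - last_retrain_ts >= RETRAIN_INTERVAL

# Example: pretend the last retrain finished 8 days ago
eight_days_ago = time.time() - 8 * 24 * 3600
print(schedule_trigger(eight_days_ago))  # True: more than a week has passed
```

In production this check usually lives in a scheduler such as cron or Airflow rather than in a long-running loop.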
Config File - retrain_pipeline.py
retrain_pipeline.py
import random
import time

import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Function to check model performance
# Returns True if retraining is needed
def check_performance(run_id):
    # Read the latest logged accuracy for this run from MLflow
    history = client.get_metric_history(run_id, "model_accuracy")
    accuracy = history[-1].value if history else 0.0
    return accuracy < 0.8

# Function to check data drift
# Returns True if drift is detected
def check_data_drift():
    # Simulate drift detection with a random score standing in
    # for a real drift statistic (e.g. PSI or a KS test)
    drift_score = random.uniform(0.0, 0.5)
    return drift_score > 0.25

# Retrain model function
def retrain_model():
    print("Retraining model...")
    # Simulate retraining by logging an improved accuracy
    mlflow.log_metric("model_accuracy", 0.85)
    print("Retraining complete.")

# Main loop to trigger retraining
with mlflow.start_run() as run:
    mlflow.log_metric("model_accuracy", 0.9)  # seed an initial accuracy
    while True:
        if check_performance(run.info.run_id) or check_data_drift():
            retrain_model()
        else:
            print("No retraining needed.")
        time.sleep(86400)  # wait 24 hours before the next check

This Python script uses MLflow to track model accuracy and simulate retraining triggers:

check_performance() checks if accuracy is below 0.8 to trigger retraining.

check_data_drift() simulates data drift detection with a threshold.
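A real drift check compares the live feature distribution against the one the model was trained on. A self-contained sketch using the Population Stability Index (PSI), a common drift statistic; the `psi` helper and the 0.25 threshold here are illustrative choices, not a library API:

```python
import math
import random

def psi(expected, actual, bins=10):
    """Population Stability Index between two samples of a numeric feature.
    A rule of thumb: PSI > 0.25 signals significant distribution shift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def frac(sample, i):
        left, right = lo + i * width, lo + (i + 1) * width
        n = sum(left <= x < right or (i == bins - 1 and x == hi) for x in sample)
        return max(n / len(sample), 1e-6)  # floor to avoid log(0)

    return sum(
        (frac(actual, i) - frac(expected, i)) * math.log(frac(actual, i) / frac(expected, i))
        for i in range(bins)
    )

def check_data_drift(train_sample, live_sample, threshold=0.25):
    return psi(train_sample, live_sample) > threshold

random.seed(1)
train = [random.gauss(0, 1) for _ in range(1000)]    # training distribution
same = [random.gauss(0, 1) for _ in range(1000)]     # live data, same distribution
shifted = [random.gauss(1.5, 1) for _ in range(1000)]  # live data, mean shifted

print(check_data_drift(train, same))     # no drift expected
print(check_data_drift(train, shifted))  # drift expected
```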

retrain_model() simulates retraining and logs improved accuracy.
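In a real pipeline, retrain_model() would refit the model on fresh data and log the measured accuracy instead of a hard-coded value. A toy sketch, assuming a hypothetical one-parameter threshold classifier so the example stays self-contained:

```python
import random

random.seed(0)

# Hypothetical fresh training data: (feature, label) pairs,
# where the label is 1 exactly when the feature exceeds 0.5
data = [(x, int(x > 0.5)) for x in (random.random() for _ in range(200))]

def train_threshold_model(rows):
    """Fit a one-parameter model: pick the threshold with the best accuracy."""
    best_t, best_acc = 0.0, 0.0
    for t in (i / 100 for i in range(101)):
        acc = sum(int(x > t) == y for x, y in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

def retrain_model(rows):
    threshold, accuracy = train_threshold_model(rows)
    print(f"Retrained: threshold={threshold:.2f}, accuracy={accuracy:.2f}")
    # In the real pipeline, this is where the new accuracy would be logged,
    # e.g. mlflow.log_metric("model_accuracy", accuracy)
    return threshold, accuracy

retrain_model(data)
```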

The main loop runs daily to check triggers and retrain if needed.

Commands
Run the retraining script to start monitoring model performance and data drift. It will retrain the model automatically when triggers are met.
Terminal
python retrain_pipeline.py
Expected Output
No retraining needed.
No retraining needed.
Retraining model...
Retraining complete.
No retraining needed.
List recent runs in MLflow to find the ID of the run that logged the accuracy metric.
Terminal
mlflow runs list --experiment-id 0
Expected Output
[table of runs with dates, names, and run IDs]
Describe a run to check the current model_accuracy value (replace <RUN_ID> with an ID from the listing).
Terminal
mlflow runs describe --run-id <RUN_ID>
Expected Output
[JSON run description; look for "model_accuracy" under "data" -> "metrics", e.g. 0.85]
Key Concept

If you remember nothing else from this pattern, remember: retrain your model automatically when performance drops or data changes to keep it accurate.

Code Example
MLOps
import mlflow
from mlflow.tracking import MlflowClient

# Store runs in a local directory instead of a tracking server
mlflow.set_tracking_uri('file:///tmp/mlruns')
client = MlflowClient()

# Log initial accuracy
with mlflow.start_run() as run:
    mlflow.log_metric('model_accuracy', 0.9)
run_id = run.info.run_id

# Function to check performance
def check_performance(run_id):
    history = client.get_metric_history(run_id, 'model_accuracy')
    accuracy = history[-1].value if history else 0.0
    return accuracy < 0.8

# Function to simulate retraining
def retrain_model():
    print('Retraining model...')
    with mlflow.start_run():
        mlflow.log_metric('model_accuracy', 0.85)
    print('Retraining complete.')

# Main trigger check
if check_performance(run_id):
    retrain_model()
else:
    print('No retraining needed.')
Output
No retraining needed.
Common Mistakes
Not checking model performance before retraining
This causes unnecessary retraining, wasting time and resources.
Always check if model accuracy is below a threshold before triggering retraining.
Ignoring data drift detection
Model may become outdated if data changes but retraining is not triggered.
Implement data drift checks to trigger retraining when input data changes significantly.
Running retraining too frequently without a schedule
Can overload resources and cause instability.
Use a schedule or cooldown period between retraining runs.
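One way to enforce such a cooldown is to record when the last retrain finished and ignore triggers until the cooldown has elapsed. A minimal sketch; the `COOLDOWN` value and `maybe_retrain` helper are hypothetical:

```python
import time

COOLDOWN = 6 * 3600  # hypothetical: at least 6 hours between retraining runs

last_retrain = 0.0

def maybe_retrain(trigger_fired, now=None):
    """Retrain only when a trigger fired AND the cooldown has elapsed."""
    global last_retrain
    now = time.time() if now is None else now
    if trigger_fired and now - last_retrain >= COOLDOWN:
        print("Retraining model...")
        last_retrain = now
        return True
    return False

print(maybe_retrain(True, now=10 * 3600))  # True: cooldown elapsed, retrains
print(maybe_retrain(True, now=12 * 3600))  # False: only 2h since last retrain
```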
Summary
Use a script to check model accuracy and data drift regularly.
Trigger retraining only when performance drops below a threshold or data drift is detected.
Log metrics with MLflow to track model performance over time.