
Trigger-based retraining (schedule, drift, performance) in MLOps - Commands & Configuration

Introduction
Machine learning models can lose accuracy over time as the data they see in production changes. Trigger-based retraining keeps models fresh by retraining them automatically when certain conditions are met, such as on a fixed schedule, when data drifts, or when performance drops.
When you want to retrain a model every week to keep it updated with new data.
When model accuracy drops below a set threshold and you want to retrain automatically.
When data changes significantly (data drift) and you want to trigger retraining to adapt.
When you want to automate retraining without manual checks to save time.
When you want to monitor model performance and retrain only when needed to save resources.
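A schedule trigger, the first of the three, can be as simple as comparing the time since the last retrain against a fixed interval. A minimal sketch, assuming a hypothetical weekly interval and a `schedule_trigger` helper (neither is part of any library):

```python
import time

RETRAIN_INTERVAL = 7 * 24 * 3600  # hypothetical weekly retraining interval, in seconds

def schedule_trigger(last_retrain_ts, now=None):
    """Return True when the retraining interval has elapsed since the last retrain."""
    now = time.time() if now is None else now
    return now - last_retrain_ts >= RETRAIN_INTERVAL

# Example: pretend the last retrain finished 8 days ago
eight_days_ago = time.time() - 8 * 24 * 3600
print(schedule_trigger(eight_days_ago))  # True: more than a week has passed
```

In production this check usually lives in a scheduler such as cron or Airflow rather than in a long-running loop.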
Config File - retrain_pipeline.py
retrain_pipeline.py
import random
import time

import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Function to check model performance
# Returns True if retraining is needed
def check_performance(run_id):
    # Read the latest logged accuracy for this run from MLflow
    history = client.get_metric_history(run_id, "model_accuracy")
    accuracy = history[-1].value if history else 0.0
    return accuracy < 0.8

# Function to check data drift
# Returns True if drift is detected
def check_data_drift():
    # Simulate drift detection with a random score standing in
    # for a real drift statistic (e.g. PSI or a KS test)
    drift_score = random.uniform(0.0, 0.5)
    return drift_score > 0.25

# Retrain model function
def retrain_model():
    print("Retraining model...")
    # Simulate retraining by logging an improved accuracy
    mlflow.log_metric("model_accuracy", 0.85)
    print("Retraining complete.")

# Main loop to trigger retraining
with mlflow.start_run() as run:
    mlflow.log_metric("model_accuracy", 0.9)  # seed an initial accuracy
    while True:
        if check_performance(run.info.run_id) or check_data_drift():
            retrain_model()
        else:
            print("No retraining needed.")
        time.sleep(86400)  # wait 24 hours before the next check

This Python script uses MLflow to track model accuracy and simulate retraining triggers:

check_performance() checks if accuracy is below 0.8 to trigger retraining.

check_data_drift() simulates data drift detection with a threshold.
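A real drift check compares the live feature distribution against the one the model was trained on. A self-contained sketch using the Population Stability Index (PSI), a common drift statistic; the `psi` helper and the 0.25 threshold here are illustrative choices, not a library API:

```python
import math
import random

def psi(expected, actual, bins=10):
    """Population Stability Index between two samples of a numeric feature.
    A rule of thumb: PSI > 0.25 signals significant distribution shift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def frac(sample, i):
        left, right = lo + i * width, lo + (i + 1) * width
        n = sum(left <= x < right or (i == bins - 1 and x == hi) for x in sample)
        return max(n / len(sample), 1e-6)  # floor to avoid log(0)

    return sum(
        (frac(actual, i) - frac(expected, i)) * math.log(frac(actual, i) / frac(expected, i))
        for i in range(bins)
    )

def check_data_drift(train_sample, live_sample, threshold=0.25):
    return psi(train_sample, live_sample) > threshold

random.seed(1)
train = [random.gauss(0, 1) for _ in range(1000)]    # training distribution
same = [random.gauss(0, 1) for _ in range(1000)]     # live data, same distribution
shifted = [random.gauss(1.5, 1) for _ in range(1000)]  # live data, mean shifted

print(check_data_drift(train, same))     # no drift expected
print(check_data_drift(train, shifted))  # drift expected
```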

retrain_model() simulates retraining and logs improved accuracy.
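In a real pipeline, retrain_model() would refit the model on fresh data and log the measured accuracy instead of a hard-coded value. A toy sketch, assuming a hypothetical one-parameter threshold classifier so the example stays self-contained:

```python
import random

random.seed(0)

# Hypothetical fresh training data: (feature, label) pairs,
# where the label is 1 exactly when the feature exceeds 0.5
data = [(x, int(x > 0.5)) for x in (random.random() for _ in range(200))]

def train_threshold_model(rows):
    """Fit a one-parameter model: pick the threshold with the best accuracy."""
    best_t, best_acc = 0.0, 0.0
    for t in (i / 100 for i in range(101)):
        acc = sum(int(x > t) == y for x, y in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

def retrain_model(rows):
    threshold, accuracy = train_threshold_model(rows)
    print(f"Retrained: threshold={threshold:.2f}, accuracy={accuracy:.2f}")
    # In the real pipeline, this is where the new accuracy would be logged,
    # e.g. mlflow.log_metric("model_accuracy", accuracy)
    return threshold, accuracy

retrain_model(data)
```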

The main loop runs daily to check triggers and retrain if needed.

Commands
Run the retraining script to start monitoring model performance and data drift. It will retrain the model automatically when triggers are met.
Terminal
python retrain_pipeline.py
Expected Output
No retraining needed.
No retraining needed.
Retraining model...
Retraining complete.
No retraining needed.
List recent runs in MLflow to find the ID of the run that logged the accuracy metric.
Terminal
mlflow runs list --experiment-id 0
Expected Output
[table of runs with dates, names, and run IDs]
Describe a run to check the current model_accuracy value (replace <RUN_ID> with an ID from the listing).
Terminal
mlflow runs describe --run-id <RUN_ID>
Expected Output
[JSON run description; look for "model_accuracy" under "data" -> "metrics", e.g. 0.85]
Key Concept

If you remember nothing else from this pattern, remember: retrain your model automatically when performance drops or data changes to keep it accurate.

Code Example
MLOps
import mlflow
from mlflow.tracking import MlflowClient

# Store runs in a local directory instead of a tracking server
mlflow.set_tracking_uri('file:///tmp/mlruns')
client = MlflowClient()

# Log initial accuracy
with mlflow.start_run() as run:
    mlflow.log_metric('model_accuracy', 0.9)
run_id = run.info.run_id

# Function to check performance
def check_performance(run_id):
    history = client.get_metric_history(run_id, 'model_accuracy')
    accuracy = history[-1].value if history else 0.0
    return accuracy < 0.8

# Function to simulate retraining
def retrain_model():
    print('Retraining model...')
    with mlflow.start_run():
        mlflow.log_metric('model_accuracy', 0.85)
    print('Retraining complete.')

# Main trigger check
if check_performance(run_id):
    retrain_model()
else:
    print('No retraining needed.')
Output
No retraining needed.
Common Mistakes
Not checking model performance before retraining
This causes unnecessary retraining, wasting time and resources.
Always check if model accuracy is below a threshold before triggering retraining.
Ignoring data drift detection
Model may become outdated if data changes but retraining is not triggered.
Implement data drift checks to trigger retraining when input data changes significantly.
Running retraining too frequently without a schedule
Can overload resources and cause instability.
Use a schedule or cooldown period between retraining runs.
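One way to enforce such a cooldown is to record when the last retrain finished and ignore triggers until the cooldown has elapsed. A minimal sketch; the `COOLDOWN` value and `maybe_retrain` helper are hypothetical:

```python
import time

COOLDOWN = 6 * 3600  # hypothetical: at least 6 hours between retraining runs

last_retrain = 0.0

def maybe_retrain(trigger_fired, now=None):
    """Retrain only when a trigger fired AND the cooldown has elapsed."""
    global last_retrain
    now = time.time() if now is None else now
    if trigger_fired and now - last_retrain >= COOLDOWN:
        print("Retraining model...")
        last_retrain = now
        return True
    return False

print(maybe_retrain(True, now=10 * 3600))  # True: cooldown elapsed, retrains
print(maybe_retrain(True, now=12 * 3600))  # False: only 2h since last retrain
```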
Summary
Use a script to check model accuracy and data drift regularly.
Trigger retraining only when performance drops below a threshold or data drift is detected.
Log metrics with MLflow to track model performance over time.