Trigger-based retraining (schedule, drift, performance) in MLOps - Commands & Configuration

Introduction
Machine learning models can lose accuracy over time as the data they see changes. Trigger-based retraining keeps models fresh by retraining them automatically when a condition is met: a schedule fires, performance drops, or the input data drifts. Typical uses:
When you want to retrain a model every week to keep it updated with new data.
When model accuracy drops below a set threshold and you want to retrain automatically.
When data changes significantly (data drift) and you want to trigger retraining to adapt.
When you want to automate retraining without manual checks to save time.
When you want to monitor model performance and retrain only when needed to save resources.
Config File - retrain_pipeline.py
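import time

import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Check model performance.
# Returns True if retraining is needed.

def check_performance():
    # Read model_accuracy from the most recent run in the default experiment ("0").
    # Metrics are stored per run, so the lookup goes through the tracking client.
    runs = client.search_runs(
        experiment_ids=["0"],
        order_by=["attributes.start_time DESC"],
        max_results=1,
    )
    # Treat a missing metric as accuracy 0 so the first check triggers training
    accuracy = runs[0].data.metrics.get("model_accuracy", 0) if runs else 0
    return accuracy < 0.8

# Check data drift.
# Returns True if drift is detected.

def check_data_drift():
    # Simulated: a real check would compare recent inputs with the training data
    drift_score = 0.3  # example drift score
    return drift_score > 0.25

# Retrain the model (simulated) and log the new accuracy in its own run

def retrain_model():
    print("Retraining model...")
    with mlflow.start_run():
        mlflow.log_metric("model_accuracy", 0.85)
    print("Retraining complete.")

# Main loop: check both triggers once a day

while True:
    if check_performance() or check_data_drift():
        retrain_model()
    else:
        print("No retraining needed.")
    time.sleep(86400)  # wait 24 hours before the next check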

This Python script uses MLflow to track model accuracy and simulate retraining triggers.

check_performance() checks if accuracy is below 0.8 to trigger retraining.

check_data_drift() simulates data drift detection by comparing a hard-coded drift score (0.3) with a 0.25 threshold, so in this demo it always reports drift.

retrain_model() simulates retraining and logs improved accuracy.

The main loop runs daily to check triggers and retrain if needed.
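The script hard-codes its drift score, so here is a minimal sketch of how such a score could be computed from data. It assumes the Population Stability Index (PSI) as the drift measure, which the lesson does not specify; the function name psi, the bin count, and the sample data are illustrative. The 0.25 threshold is reused from the script above.

```python
import math

def psi(reference, current, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the reference range into the edge bins
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))

def check_data_drift(reference, current, threshold=0.25):
    # Returns True if the drift score exceeds the threshold
    return psi(reference, current) > threshold

reference = [float(v) for v in range(100)]   # stand-in for training data
shifted = [v + 50 for v in reference]        # stand-in for recent production data
print(check_data_drift(reference, reference))  # False: identical samples give a PSI of 0
print(check_data_drift(reference, shifted))    # True: half the reference bins are now empty
```

In a real pipeline the two samples would come from the training set and a recent window of production inputs, one feature at a time.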

Commands
Run the retraining script to start monitoring model performance and data drift. It will retrain the model automatically when triggers are met.
Terminal
python retrain_pipeline.py
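Expected Output
Retraining model...
Retraining complete.
These two lines repeat once per daily check, because the simulated drift score (0.3) always exceeds its 0.25 threshold. With a real drift check, a day on which neither trigger fires prints "No retraining needed." instead.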
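List the runs in the default experiment to verify that the retraining runs were recorded. MLflow stores metrics per run, so the run ID is needed to look up model_accuracy.
Terminal
mlflow runs list --experiment-id 0
Expected Output
A table of the runs in the experiment, one row per run, including each run's ID.
Describe a run to see its logged metrics and check current model performance. Replace <run_id> with an ID from the previous command.
Terminal
mlflow runs describe --run-id <run_id>
Expected Output
JSON describing the run. For a retraining run from this script, model_accuracy appears among the run's metrics with the value 0.85.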
Key Concept

If you remember nothing else from this pattern, remember: retrain your model automatically when performance drops or data changes to keep it accurate.

Code Example
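import mlflow
from mlflow.tracking import MlflowClient

# Store runs in a local directory so the example needs no tracking server
mlflow.set_tracking_uri('file:///tmp/mlruns')
client = MlflowClient()

# Log an initial accuracy and remember which run holds it
with mlflow.start_run() as run:
    mlflow.log_metric('model_accuracy', 0.9)
latest_run_id = run.info.run_id

# Return True when the latest logged accuracy is below the threshold

def check_performance():
    # get_metric_history needs the run ID as well as the metric name
    history = client.get_metric_history(latest_run_id, 'model_accuracy')
    accuracy = history[-1].value if history else 0
    return accuracy < 0.8

# Simulate retraining and log the new accuracy in a fresh run

def retrain_model():
    global latest_run_id
    print('Retraining model...')
    with mlflow.start_run() as run:
        mlflow.log_metric('model_accuracy', 0.85)
    latest_run_id = run.info.run_id
    print('Retraining complete.')

# Main trigger check

if check_performance():
    retrain_model()
else:
    print('No retraining needed.')
Output
No retraining needed.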
Common Mistakes
Not checking model performance before retraining
This causes unnecessary retraining, wasting time and resources.
Always check if model accuracy is below a threshold before triggering retraining.
Ignoring data drift detection
Model may become outdated if data changes but retraining is not triggered.
Implement data drift checks to trigger retraining when input data changes significantly.
Running retraining too frequently, without a schedule or cooldown
Can overload resources and cause instability.
Use a schedule or cooldown period between retraining runs.
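One way to combine a schedule with a cooldown is to gate every retraining decision on the time since the last retrain. The sketch below assumes a one-day cooldown and a weekly schedule; the names COOLDOWN, SCHEDULE, and should_retrain are hypothetical and not part of MLflow.

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(days=1)   # hypothetical minimum gap between retraining runs
SCHEDULE = timedelta(days=7)   # hypothetical maximum model age before a scheduled retrain

def should_retrain(now, last_retrained, trigger_fired):
    """Decide whether to retrain, given the time of the last retrain and
    whether a performance or drift trigger fired."""
    elapsed = now - last_retrained
    if elapsed < COOLDOWN:
        return False       # too soon, even if a trigger fired
    if elapsed >= SCHEDULE:
        return True        # scheduled retrain, regardless of triggers
    return trigger_fired   # in between, retrain only on a trigger

last = datetime(2024, 1, 1)
print(should_retrain(datetime(2024, 1, 1, 12), last, trigger_fired=True))   # False: inside cooldown
print(should_retrain(datetime(2024, 1, 3), last, trigger_fired=True))       # True: trigger after cooldown
print(should_retrain(datetime(2024, 1, 3), last, trigger_fired=False))      # False: nothing to do
print(should_retrain(datetime(2024, 1, 9), last, trigger_fired=False))      # True: weekly schedule reached
```

Passing the current time in as an argument keeps the function easy to test; the monitoring loop would call it with datetime.now().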
Summary
Use a script to check model accuracy and data drift regularly.
Trigger retraining only when performance drops below a threshold or data drift is detected.
Log metrics with MLflow to track model performance over time.

Practice

1. What is the main purpose of trigger-based retraining in machine learning operations?
easy
A. Automatically update models when data or performance changes
B. Manually retrain models on a fixed schedule
C. Store training data in a database
D. Visualize model performance metrics

Solution

  1. Step 1: Understand trigger-based retraining concept

    Trigger-based retraining means models update automatically when certain conditions happen, like data changes or performance drops.
  2. Step 2: Compare options to concept

    Only option A, "Automatically update models when data or performance changes", describes automatic updates based on triggers, which matches the concept.
  3. Final Answer:

    Automatically update models when data or performance changes -> Option A
  4. Quick Check:

    Trigger-based retraining = automatic updates [OK]
Hint: Triggers mean automatic updates, not manual tasks [OK]
Common Mistakes:
  • Confusing manual retraining with trigger-based retraining
  • Thinking triggers only store data
  • Assuming triggers visualize data
2. Which SQL statement correctly creates a trigger to start retraining after new data is inserted into a table named training_data?
easy
A. CREATE retrain_trigger AFTER INSERT ON training_data CALL start_retraining();
B. INSERT TRIGGER retrain_trigger ON training_data AFTER EXEC start_retraining();
C. TRIGGER CREATE retrain_trigger ON training_data AFTER INSERT EXEC start_retraining();
D. CREATE TRIGGER retrain_trigger AFTER INSERT ON training_data FOR EACH ROW EXECUTE PROCEDURE start_retraining();

Solution

  1. Step 1: Recall correct SQL trigger syntax

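    In PostgreSQL, a trigger is created with CREATE TRIGGER followed by the timing (AFTER), the event (INSERT), the table, FOR EACH ROW, and the function to run (EXECUTE PROCEDURE; PostgreSQL 11 and later also accept EXECUTE FUNCTION).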
  2. Step 2: Match syntax to options

    Only option D follows this order: CREATE TRIGGER retrain_trigger AFTER INSERT ON training_data FOR EACH ROW EXECUTE PROCEDURE start_retraining();
  3. Final Answer:

    CREATE TRIGGER retrain_trigger AFTER INSERT ON training_data FOR EACH ROW EXECUTE PROCEDURE start_retraining(); -> Option D
  4. Quick Check:

    Correct trigger syntax = CREATE TRIGGER retrain_trigger AFTER INSERT ON training_data FOR EACH ROW EXECUTE PROCEDURE start_retraining(); [OK]
Hint: Look for 'CREATE TRIGGER ... EXECUTE PROCEDURE' pattern [OK]
Common Mistakes:
  • Using CALL instead of EXECUTE PROCEDURE
  • Wrong order of keywords
  • Missing FOR EACH ROW clause
3. Given this trigger function in PostgreSQL:
CREATE OR REPLACE FUNCTION check_drift() RETURNS trigger AS $$
BEGIN
  IF NEW.error_rate > 0.1 THEN
    PERFORM start_retraining();
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

What happens when a new row with error_rate = 0.15 is inserted?
medium
A. The retraining procedure is called because error_rate > 0.1
B. Nothing happens because triggers don't run on INSERT
C. An error occurs due to syntax mistake
D. The row is rejected and not inserted

Solution

  1. Step 1: Analyze trigger function logic

    The function checks if NEW.error_rate > 0.1; if true, it calls start_retraining().
  2. Step 2: Apply condition to given data

    Since error_rate is 0.15, which is greater than 0.1, the retraining procedure is called.
  3. Final Answer:

    The retraining procedure is called because error_rate > 0.1 -> Option A
  4. Quick Check:

    error_rate 0.15 > 0.1 triggers retraining [OK]
Hint: Check condition in trigger function with inserted data [OK]
Common Mistakes:
  • Thinking triggers don't run on INSERT
  • Assuming syntax error without checking code
  • Believing row insertion fails
4. You wrote this trigger to start retraining on performance drop:
CREATE TRIGGER retrain_on_drop
AFTER UPDATE ON model_metrics
FOR EACH ROW
WHEN (NEW.accuracy < OLD.accuracy)
EXECUTE PROCEDURE start_retraining();

But retraining never starts. What is the likely problem?
medium
A. Triggers cannot run AFTER UPDATE events
B. The WHEN clause is not supported in all SQL dialects
C. start_retraining() must be a procedure, not a function
D. The trigger name is invalid

Solution

  1. Step 1: Understand WHEN clause support

    Not all SQL databases support the WHEN clause in triggers; some require condition checks inside the function.
  2. Step 2: Identify why retraining doesn't start

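    A database that does not support the WHEN clause typically rejects the CREATE TRIGGER statement, so the trigger is never created and retraining never starts. Moving the accuracy comparison into the trigger function avoids the problem.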
  3. Final Answer:

    The WHEN clause is not supported in all SQL dialects -> Option B
  4. Quick Check:

    WHEN clause support varies by SQL dialect [OK]
Hint: Check if your SQL dialect supports WHEN in triggers [OK]
Common Mistakes:
  • Assuming triggers can't run AFTER UPDATE
  • Confusing functions and procedures
  • Thinking trigger names cause failure
5. You want to design a trigger-based retraining system that retrains a model only if both the data drift exceeds threshold and model accuracy drops below 90%. Which approach is best?
hard
A. Manually retrain the model when you notice performance issues
B. Create two separate triggers: one for drift and one for accuracy, each retraining independently
C. Create a trigger that calls a procedure checking both drift and accuracy before retraining
D. Schedule retraining daily regardless of drift or accuracy

Solution

  1. Step 1: Understand combined condition requirement

    The retraining should happen only if both drift and accuracy conditions are met together.
  2. Step 2: Evaluate options for combined logic

    Option C uses a single trigger that calls a procedure, and the procedure checks both conditions before retraining, so both must be true.
  3. Step 3: Why other options fail

    Option B retrains on either condition alone, so it does not require both. Option D ignores the conditions entirely. Option A is manual, not trigger-based.
  4. Final Answer:

    Create a trigger that calls a procedure checking both drift and accuracy before retraining -> Option C
  5. Quick Check:

    Combined condition needs single trigger with logic [OK]
Hint: Use one trigger with combined condition check procedure [OK]
Common Mistakes:
  • Using separate triggers causing unnecessary retraining
  • Ignoring condition checks in triggers
  • Relying on manual retraining only
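The combined check from question 5 can also be sketched in Python, in the style of the retraining script earlier in this lesson. The function name needs_retraining is hypothetical; the 0.25 drift threshold is borrowed from that script and the 0.90 accuracy threshold from the question.

```python
def needs_retraining(drift_score, accuracy,
                     drift_threshold=0.25, accuracy_threshold=0.90):
    """Return True only when drift is high AND accuracy is low."""
    drifted = drift_score > drift_threshold
    degraded = accuracy < accuracy_threshold
    return drifted and degraded

print(needs_retraining(drift_score=0.30, accuracy=0.85))  # True: both conditions hold
print(needs_retraining(drift_score=0.30, accuracy=0.95))  # False: accuracy is still fine
print(needs_retraining(drift_score=0.10, accuracy=0.85))  # False: no significant drift
```

Keeping both conditions in one function mirrors option C: a single decision point, so neither condition can start a retrain on its own.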