Trigger-based retraining (schedule, drift, performance) in MLOps - Time & Space Complexity
When retraining machine learning models based on triggers such as a schedule, data drift, or a performance drop, it is important to know how the time to decide on and run retraining grows as the data volume and the frequency of checks increase. In other words, we want to understand how the retraining process scales.
Analyze the time complexity of the following retraining trigger check.
```python
for batch in data_batches:
    drift_score = calculate_drift(batch)
    performance = evaluate_model(batch)
    if drift_score > drift_threshold or performance < perf_threshold:
        retrain_model()
        break
wait_until_next_schedule()
```
This code checks each batch for drift and performance, triggers retraining if needed, or waits for the next scheduled check.
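The snippet relies on helper functions that are not defined here. A minimal runnable sketch might look like the following, where the drift metric, the thresholds, and all helper names are illustrative stand-ins rather than a real drift-detection method:

```python
# Illustrative thresholds (assumptions, not from the original)
drift_threshold = 0.3
perf_threshold = 0.8

def calculate_drift(batch):
    # Stand-in: real drift detection might use KS tests or PSI;
    # here we just measure distance of the batch mean from 0.5
    return abs(sum(batch) / len(batch) - 0.5)

def evaluate_model(batch):
    # Stand-in: real evaluation would score the model on labeled data
    return 1.0 - calculate_drift(batch)

def check_and_retrain(data_batches):
    """Return True if retraining was triggered, False otherwise."""
    for batch in data_batches:
        drift_score = calculate_drift(batch)
        performance = evaluate_model(batch)
        if drift_score > drift_threshold or performance < perf_threshold:
            return True   # retrain_model() would run here
    return False          # wait_until_next_schedule() would run here

stable = [[0.5, 0.5, 0.5]] * 3          # mean 0.5 -> drift 0.0
drifted = stable + [[0.9, 0.9, 0.9]]    # mean 0.9 -> drift 0.4
print(check_and_retrain(stable))   # False: no batch crosses a threshold
print(check_and_retrain(drifted))  # True: the last batch trips the drift check
```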
Look at what repeats as input grows.
- Primary operation: Looping over data batches to calculate drift and evaluate performance.
- How many times: Once per batch, until retraining triggers or all batches are checked.
As the number of data batches grows, the checks increase linearly.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Up to 10 drift + performance checks |
| 100 | Up to 100 checks |
| 1000 | Up to 1000 checks |
Pattern observation: The number of operations grows directly with the number of batches checked.
Time Complexity: O(n)
This means the time to decide on retraining grows linearly with the number of data batches checked. Note that this is the worst case: the `break` lets the loop exit early as soon as one batch trips a threshold.
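One way to see the "up to n" behavior empirically is to count loop iterations with an instrumented version of the check (the function and parameter names here are hypothetical):

```python
def checks_until_decision(n_batches, trigger_at=None):
    """Count loop iterations; trigger_at simulates the first batch that fires."""
    checks = 0
    for i in range(n_batches):
        checks += 1   # one drift + one performance evaluation per batch
        if trigger_at is not None and i == trigger_at:
            break     # retraining triggered, loop exits early
    return checks

print(checks_until_decision(1000))                # 1000: worst case, no trigger fires
print(checks_until_decision(1000, trigger_at=4))  # 5: early trigger ends the scan
```

The worst case scales linearly with n, while an early trigger can end the scan almost immediately.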
[X] Wrong: "Retraining checks happen instantly no matter how much data there is."
[OK] Correct: Each batch requires separate checks, so more batches mean more time spent before deciding.
Understanding how retraining triggers scale helps you explain system responsiveness and efficiency in real projects.
What if we added parallel processing to check batches simultaneously? How would the time complexity change?
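As a hedged sketch of that idea, using Python's standard `concurrent.futures` (the per-batch check function is a stand-in): with p workers the total work is still O(n), but the wall-clock time drops toward O(n/p), plus some coordination overhead.

```python
from concurrent.futures import ThreadPoolExecutor

def batch_needs_retrain(batch):
    # Stand-in for the combined drift + performance checks on one batch
    return max(batch) > 0.9

def any_trigger_parallel(data_batches, workers=4):
    """Check batches concurrently; True if any batch fires a trigger."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map returns results in order; any() stops consuming after the
        # first True, though already-submitted tasks still run to completion
        return any(pool.map(batch_needs_retrain, data_batches))

batches = [[0.1, 0.2], [0.3, 0.4], [0.95, 0.5]]
print(any_trigger_parallel(batches))  # True: the last batch exceeds 0.9
```

One design caveat: parallel checking trades the sequential loop's cheap early exit for throughput, so it pays off mainly when batches are numerous and each check is expensive.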