Bird
Raised Fist0
MLOpsdevops~10 mins

Concept drift detection in MLOps - Step-by-Step Execution

Choose your learning style10 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Process Flow - Concept drift detection
Start Monitoring Data Stream
Collect New Data Batch
Compare New Data to Training Data
Calculate Drift Metric
Is Drift Detected?
NoContinue Monitoring
Yes
Trigger Alert or Retrain Model
Update Model or Data Pipeline
Resume Monitoring
The flow shows how new data is monitored continuously, compared to original training data, and if drift is detected, an alert or retraining is triggered before resuming monitoring.
Execution Sample
MLOps
def detect_drift(old_data, new_data, threshold=0.1):
    drift_score = calculate_statistical_distance(old_data, new_data)
    if drift_score > threshold:
        return True
    return False
This function compares old and new data using a statistical distance and returns True if drift is detected above a threshold.
Process Table
StepActionOld Data SummaryNew Data SummaryDrift ScoreDrift DetectedNext Step
1Collect new data batchMean=50, Std=5Mean=52, Std=50.04NoContinue Monitoring
2Collect new data batchMean=50, Std=5Mean=55, Std=60.12YesTrigger Alert
3Retrain modelModel updated with new data---Resume Monitoring
4Collect new data batchMean=55, Std=6Mean=54, Std=60.02NoContinue Monitoring
💡 Monitoring continues until drift score exceeds threshold, then model update occurs and monitoring resumes.
Status Tracker
VariableStartAfter Step 1After Step 2After Step 3After Step 4
old_data_summaryMean=50, Std=5Mean=50, Std=5Mean=50, Std=5Mean=55, Std=6Mean=55, Std=6
new_data_summary-Mean=52, Std=5Mean=55, Std=6-Mean=54, Std=6
drift_score-0.040.12-0.02
drift_detected-NoYes-No
Key Moments - 3 Insights
Why does the drift score sometimes increase even if the mean changes only slightly?
Because drift score measures statistical distance considering both mean and standard deviation, small changes in either can increase the score as shown in Step 2 of the execution_table.
What happens after drift is detected at Step 2?
The system triggers an alert and retrains the model with new data, updating old_data_summary as shown in Step 3 before resuming monitoring.
Why does monitoring continue after retraining the model?
Because concept drift can happen anytime, continuous monitoring ensures the model stays accurate, as shown by Step 4 where monitoring resumes with updated data.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table at Step 2, what is the drift score and was drift detected?
ADrift score is 0.02 and drift detected is No
BDrift score is 0.12 and drift detected is Yes
CDrift score is 0.04 and drift detected is No
DDrift score is 0.12 and drift detected is No
💡 Hint
Check the 'Drift Score' and 'Drift Detected' columns at Step 2 in the execution_table.
At which step does the model get retrained according to the execution_table?
AStep 1
BStep 2
CStep 3
DStep 4
💡 Hint
Look for the action 'Retrain model' in the 'Action' column.
If the threshold was lowered to 0.03, at which step would drift be detected first?
AStep 1
BStep 2
CStep 3
DStep 4
💡 Hint
Compare drift scores in the execution_table to the new threshold of 0.03.
Concept Snapshot
Concept Drift Detection:
- Continuously monitor new data against training data
- Calculate a drift metric (e.g., statistical distance)
- If drift metric > threshold, trigger alert or retrain
- Update model or pipeline accordingly
- Resume monitoring to catch future drifts
Full Transcript
Concept drift detection means watching new data over time to see if it changes from the data the model learned on. We collect new data batches and compare their statistics to the old data. We calculate a drift score that measures how different the new data is. If this score is above a set threshold, we say drift is detected. Then, we alert or retrain the model with new data to keep it accurate. After updating, we keep watching for more drift. This process helps models stay reliable as data changes.

Practice

(1/5)
1. What is the main purpose of concept drift detection in machine learning?
easy
A. To identify when the data distribution changes over time affecting model accuracy
B. To increase the training speed of a machine learning model
C. To reduce the size of the training dataset
D. To improve the hardware performance for model training

Solution

  1. Step 1: Understand concept drift meaning

    Concept drift means the data changes over time, causing model accuracy to drop.
  2. Step 2: Identify the purpose of detection

    Detecting drift helps know when the model needs updating to keep accuracy high.
  3. Final Answer:

    To identify when the data distribution changes over time affecting model accuracy -> Option A
  4. Quick Check:

    Concept drift detection = find data changes [OK]
Hint: Concept drift means data changes; detection finds these changes [OK]
Common Mistakes:
  • Confusing drift detection with speeding up training
  • Thinking drift reduces dataset size
  • Assuming drift improves hardware
2. Which of the following is a correct method to detect concept drift?
easy
A. Reduce the number of model layers
B. Increase the batch size during model training
C. Use a larger learning rate
D. Compare model accuracy on recent data versus older data

Solution

  1. Step 1: Identify drift detection methods

    Drift detection compares model performance on new data to old data to find changes.
  2. Step 2: Evaluate options

    Only comparing accuracy over time relates to drift detection; others affect training but not drift.
  3. Final Answer:

    Compare model accuracy on recent data versus older data -> Option D
  4. Quick Check:

    Drift detection = compare old vs new accuracy [OK]
Hint: Drift detection compares model accuracy over time [OK]
Common Mistakes:
  • Confusing training hyperparameters with drift detection
  • Thinking batch size or learning rate detect drift
  • Ignoring performance comparison over time
3. Given this Python snippet for drift detection:
old_accuracy = 0.85
new_accuracy = 0.70
threshold = 0.1
if old_accuracy - new_accuracy > threshold:
    print('Drift detected')
else:
    print('No drift')

What will be the output?
medium
A. No drift
B. SyntaxError
C. Drift detected
D. No output

Solution

  1. Step 1: Calculate accuracy difference

    old_accuracy - new_accuracy = 0.85 - 0.70 = 0.15
  2. Step 2: Compare difference to threshold

    0.15 > 0.1, so condition is true and 'Drift detected' prints.
  3. Final Answer:

    Drift detected -> Option C
  4. Quick Check:

    0.15 > 0.1 means drift detected [OK]
Hint: Subtract accuracies and compare to threshold [OK]
Common Mistakes:
  • Mixing up greater than and less than signs
  • Ignoring the threshold value
  • Assuming syntax error due to > symbol
4. This code snippet is intended to detect concept drift but has an error:
old_acc = 0.9
new_acc = 0.85
threshold = 0.05
if new_acc - old_acc > threshold:
    print('Drift detected')
else:
    print('No drift')

What is the error?
medium
A. The threshold value is too low to detect drift
B. The condition should be 'old_acc - new_acc > threshold' to detect accuracy drop
C. The print statements are reversed
D. There is a syntax error in the if statement

Solution

  1. Step 1: Understand drift detection logic

    Drift means accuracy drops, so we check if old accuracy minus new accuracy exceeds threshold.
  2. Step 2: Analyze the condition

    The code checks if new_acc - old_acc > threshold, which is negative here (0.85 - 0.9 = -0.05), so it won't detect drift correctly.
  3. Final Answer:

    The condition should be 'old_acc - new_acc > threshold' to detect accuracy drop -> Option B
  4. Quick Check:

    Check accuracy drop as old - new > threshold [OK]
Hint: Subtract new accuracy from old to detect drop [OK]
Common Mistakes:
  • Subtracting in wrong order
  • Assuming threshold value causes error
  • Thinking print statements cause problem
5. You have a model deployed in production. You want to detect concept drift using data distribution changes. Which approach is best to implement?
hard
A. Monitor statistical differences in feature distributions between training and recent data
B. Retrain the model daily regardless of data changes
C. Only monitor model accuracy without checking data
D. Increase model complexity to handle all data variations

Solution

  1. Step 1: Understand concept drift detection methods

    Detecting drift by monitoring data distribution changes helps catch shifts before accuracy drops.
  2. Step 2: Evaluate options for best practice

    Monitor statistical differences in feature distributions between training and recent data uses statistical tests on features, which is a proactive and effective drift detection method. Other options either ignore data changes or waste resources.
  3. Final Answer:

    Monitor statistical differences in feature distributions between training and recent data -> Option A
  4. Quick Check:

    Data distribution monitoring = best drift detection [OK]
Hint: Check feature stats differences to detect drift early [OK]
Common Mistakes:
  • Retraining blindly without drift detection
  • Ignoring data changes and only watching accuracy
  • Assuming bigger models fix drift automatically