Data drift detection in MLOps - Step-by-Step Execution

Process Flow - Data drift detection

Collect baseline data

↓

Train model on baseline

↓

Collect new incoming data

↓

Compare new data to baseline

↓

Calculate drift metrics

↓

Is drift above threshold?

No → Continue monitoring

Yes ↓

Trigger alert or retrain model

This flow shows how data drift detection compares new data to baseline data, calculates drift, and triggers alerts if drift is significant.
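The flow above can be sketched as a small monitoring loop. This is a minimal sketch under stated assumptions, not a definitive implementation: `mean_shift`, `monitor`, the decision labels, and the sample numbers are hypothetical names and values chosen for illustration. The mean shift used here is a deliberately simple stand-in for whatever statistical distance a real pipeline would use.

```python
from statistics import mean, pstdev

def mean_shift(baseline, batch):
    """Hypothetical drift metric: shift in mean, in units of baseline std dev."""
    spread = pstdev(baseline)
    if spread == 0:
        # Constant baseline: any change in the mean counts as maximal drift
        return 0.0 if mean(batch) == mean(baseline) else float("inf")
    return abs(mean(batch) - mean(baseline)) / spread

def monitor(baseline, batches, threshold=0.3, metric=mean_shift):
    """Compare each incoming batch to the baseline and record a decision."""
    decisions = []
    for batch in batches:
        score = metric(baseline, batch)
        if score > threshold:
            decisions.append("alert_or_retrain")
        else:
            decisions.append("continue_monitoring")
    return decisions

baseline = [10.0, 12.0, 11.0, 13.0, 9.0]
batches = [
    [10.5, 11.5, 11.0, 12.0, 10.0],  # same mean as the baseline
    [15.0, 17.0, 16.0, 18.0, 14.0],  # every value shifted upward by 5
]
decisions = monitor(baseline, batches)
print(decisions)
```

Passing the metric as a parameter keeps the loop independent of how drift is measured, so the same flow works if the metric is swapped later.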

Execution Sample

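# load_data, calculate_drift, and alert are placeholder helpers, not library functions
baseline = load_data('baseline.csv')   # reference data the model was trained on
new_data = load_data('new.csv')        # newly collected incoming data
drift_score = calculate_drift(baseline, new_data)
if drift_score > 0.3:                  # example threshold
    alert('Data drift detected')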

This code loads baseline and new data, calculates a drift score, and alerts if the score exceeds 0.3.
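The sample leaves `load_data`, `calculate_drift`, and `alert` undefined. One possible way to fill them in is sketched below, using only the Python standard library. Everything here is an assumption made for illustration: the single `value` column, the `write_csv` demo helper, and the choice of the two-sample Kolmogorov-Smirnov statistic as the drift score. That statistic was picked only because it falls between 0 and 1, which makes a threshold such as 0.3 easy to interpret.

```python
import csv
import os
import tempfile
from bisect import bisect_right

def load_data(path, column="value"):
    """Read one numeric column from a CSV file (the column name is an assumption)."""
    with open(path, newline="") as f:
        return [float(row[column]) for row in csv.DictReader(f)]

def calculate_drift(baseline, new_data):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    two empirical CDFs. Ranges from 0 (identical) to 1 (no overlap)."""
    a, b = sorted(baseline), sorted(new_data)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )

def alert(message):
    """Stand-in for a real notification channel."""
    print(f"ALERT: {message}")

def write_csv(path, values, column="value"):
    """Create a small demo file; used only to make this example self-contained."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([column])
        writer.writerows([v] for v in values)

with tempfile.TemporaryDirectory() as tmp:
    baseline_path = os.path.join(tmp, "baseline.csv")
    new_path = os.path.join(tmp, "new.csv")
    write_csv(baseline_path, [1, 2, 3, 4])
    write_csv(new_path, [3, 4, 5, 6])

    baseline = load_data(baseline_path)
    new_data = load_data(new_path)
    drift_score = calculate_drift(baseline, new_data)
    if drift_score > 0.3:
        alert("Data drift detected")
```

With these demo files, half of the new values lie above the baseline range, so the score exceeds 0.3 and the alert fires.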

Process Table

Step	Action	Data Compared	Drift Score	Decision	Output
1	Load baseline data	baseline.csv	-	N/A	Baseline data loaded
2	Load new data	new.csv	-	N/A	New data loaded
3	Calculate drift score	baseline vs new	0.25	0.25 <= 0.3	No alert
4	Calculate drift score	baseline vs new	0.35	0.35 > 0.3	Alert triggered: Data drift detected

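💡 Steps 3 and 4 show two alternative outcomes: a drift score at or below the threshold produces no alert, while a score above it triggers the alert. Execution ends after either outcome.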
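The Decision and Output columns of the table can be reproduced with a few lines. This is a small sketch in which `decide` and `THRESHOLD` are hypothetical names. Because the comparison is strict, a score exactly equal to the threshold does not raise an alert.

```python
THRESHOLD = 0.3  # example threshold taken from the sample code

def decide(drift_score, threshold=THRESHOLD):
    """Return the Output column of the table for a given drift score."""
    if drift_score > threshold:
        return "Alert triggered: Data drift detected"
    return "No alert"

# The two scores from the table, plus the boundary case
for score in (0.25, 0.3, 0.35):
    print(score, "->", decide(score))
```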

Status Tracker

Variable	Start	After Step 1	After Step 2	After Step 3	After Step 4
baseline	None	Loaded baseline.csv data	Loaded baseline.csv data	Loaded baseline.csv data	Loaded baseline.csv data
new_data	None	None	Loaded new.csv data	Loaded new.csv data	Loaded new.csv data
drift_score	None	None	None	0.25 or 0.35 depending on data	0.25 or 0.35 depending on data

Key Moments - 2 Insights

Why do we compare new data to baseline data instead of just looking at new data alone?

What does the drift_score represent and why is there a threshold?

Visual Quiz - 3 Questions

Test your understanding

Look at the Process Table. What is the drift_score at step 3?

A. 0.3

B. 0.25

C. 0.35

D. No score calculated

Concept Snapshot

Data drift detection compares new data to baseline data.
Calculate a drift score to measure difference.
If drift score > threshold, trigger alert or retrain.
Example threshold: 0.3 (the right value depends on the drift metric and the data).
Helps keep ML models accurate over time.
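The snapshot does not say which drift score to use. One commonly used option is the Population Stability Index (PSI), sketched below with the standard library only. The function name `psi`, the equal-width binning, the bin count, and the `eps` smoothing are all assumptions made for this example. An often-quoted rule of thumb treats PSI above roughly 0.25 as significant drift, but that cutoff is a convention and should be validated on real data.

```python
import math

def psi(baseline, new_data, bins=4, eps=1e-6):
    """Population Stability Index over equal-width bins spanning the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width for a constant baseline

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(baseline), proportions(new_data)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

baseline = list(range(100))
shifted = [v + 40 for v in baseline]  # same shape, moved upward by 40

print(psi(baseline, baseline))  # identical data, so every term is zero
print(psi(baseline, shifted))   # well above the 0.25 rule of thumb
```

Each term of the sum is non-negative, so PSI is zero only when the binned proportions match and grows as they diverge.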

Full Transcript

Data drift detection is a process where we first collect baseline data and train a model on it. Then, as new data comes in, we compare it to the baseline data to see if the data distribution has changed. We calculate a drift score to quantify this difference. If the drift score is above a set threshold, for example 0.3, we trigger an alert to notify that data drift has occurred. This helps us know when to retrain or adjust our machine learning models to keep them accurate. The execution steps include loading baseline and new data, calculating the drift score, and deciding whether to alert based on the threshold.

Practice

1. What is the main purpose of data drift detection in MLOps?

easy

A. To reduce the size of the dataset

B. To check if new data differs significantly from the training data

C. To improve the speed of model training

D. To increase the number of features in the model

Solution

Step 1: Understand data drift concept

Step 2: Identify the purpose of detection

Final Answer: B. To check if new data differs significantly from the training data
