Concept drift detection in MLOps - Time & Space Complexity
When detecting concept drift, we want to know how the time spent checking for changes grows as the data grows. The question: how does the detection process scale with more incoming data?
Analyze the time complexity of the following code snippet.
```python
# Sliding window concept drift detection
window_size = 100

for i in range(len(data) - window_size + 1):
    window = data[i:i + window_size]       # slice out the current window
    drift_score = calculate_drift(window)  # score how far this window has drifted
    if drift_score > threshold:
        alert_drift(i)                     # flag the window's start index
```
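The snippet assumes `calculate_drift` and `alert_drift` already exist. A minimal, hypothetical sketch of both is below; the drift score here is just the absolute gap between the window mean and a reference mean, whereas real detectors typically use statistical tests (e.g. Kolmogorov-Smirnov) or dedicated algorithms such as ADWIN:

```python
# Hypothetical helpers so the sliding-window snippet runs end to end.
# reference_mean is an assumed statistic of the training distribution.
reference_mean = 0.0

def calculate_drift(window):
    """Toy drift score: distance of the window mean from the reference mean."""
    return abs(sum(window) / len(window) - reference_mean)

def alert_drift(position):
    """Report the start index of a window whose score exceeded the threshold."""
    print(f"Drift detected in window starting at index {position}")
```

Any score-based detector with the same two-function shape can be dropped into the loop without changing its time complexity.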
This code slides a fixed-size window over the data to check for drift at each step.
Identify the repeated work: loops, recursion, and array traversals.
- Primary operation: Loop sliding the window over data and calculating drift each time.
- How many times: once per window position, i.e. `len(data) - window_size + 1` times.
As the data size grows, the number of windows checked grows proportionally: each new data point adds one new window.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | max(0, 10 - 100 + 1) = 0 (no full windows) |
| 100 | 100 - 100 + 1 = 1 (one window) |
| 1000 | 1000 - 100 + 1 = 901 windows checked |
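The window counts in the table can be checked directly by counting loop iterations, with `window_size = 100` as in the snippet:

```python
# Count how many windows the sliding loop checks for a given data size.
window_size = 100

def count_windows(n):
    """Number of loop iterations for n data points (never negative)."""
    return max(0, n - window_size + 1)

for n in (10, 100, 1000):
    print(n, count_windows(n))  # 10 -> 0, 100 -> 1, 1000 -> 901
```

Note that `range(len(data) - window_size + 1)` simply produces zero iterations when the data is shorter than the window, which is what `max(0, ...)` captures.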
Pattern observation: once the data is larger than the window, the number of windows checked grows linearly with the data size.
Time Complexity: O(n) for a fixed window size. Each call to `calculate_drift` examines `window_size` points, so the total work is about `(n - window_size + 1) * window_size`; with `window_size` held constant, this is O(n).
Space Complexity: O(window_size), since each iteration slices out a single window of fixed size.
This means the time to detect drift grows in direct proportion to the amount of data.
[X] Wrong: "The detection time stays the same no matter how much data we have."
[OK] Correct: Each new data point adds a new window to check, so time grows with data size.
Understanding how detection time scales helps you design systems that handle growing data smoothly.
"What if we increased the window size as data grows? How would the time complexity change?"
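One way to explore that question empirically is to count the total points examined when the window scales with the data. The choice `window_size = n // 2` below is purely illustrative:

```python
def total_work(n, window_size):
    """Total data points examined: number of windows x points per window."""
    num_windows = max(0, n - window_size + 1)
    return num_windows * window_size

# Fixed window (100): work grows linearly with n.
# Window proportional to n (here n // 2): roughly n/2 windows of n/2 points
# each, so work grows quadratically, i.e. O(n^2).
for n in (100, 200, 400):
    print(n, total_work(n, 100), total_work(n, n // 2))
```

Doubling `n` roughly doubles the fixed-window column but roughly quadruples the proportional-window column, which is the signature of quadratic growth.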