Evidently AI for monitoring in MLOps - Time & Space Complexity
When using Evidently AI for monitoring machine learning models, it's important to understand how the time to process data grows as the amount of data increases.
We want to know how the monitoring workload changes when we feed more data for analysis.
Analyze the time complexity of the following Evidently AI monitoring code snippet.
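# Legacy Report API (the evidently.report module, as in Evidently 0.4.x)
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
# reference_data and current_data are pandas DataFrames with the same columns
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference_data, current_data=current_data)
report.save_html('monitoring_report.html')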
This code builds a data drift report by comparing the reference and current datasets, then saves it as an HTML file.
Look at what repeats when the report calculates data drift.
- Primary operation: Comparing each feature's distribution between reference and current datasets.
- How many times: Once per feature, and for each data point in the datasets.
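The repetition described above can be sketched with a toy drift check. This is a minimal illustration, not Evidently's actual statistical test: the function `toy_drift_check`, the mean-shift rule, and the `threshold` value are all assumptions made for the example.

```python
from statistics import mean


def toy_drift_check(reference, current, threshold=0.5):
    """Flag features whose mean shifts by more than `threshold`.

    Toy rule for illustration only; real drift tests are more involved.
    """
    drifted = {}
    for feature in reference:                  # once per feature (f iterations)
        ref_mean = mean(reference[feature])    # one pass over the reference values
        cur_mean = mean(current[feature])      # one pass over the current values
        drifted[feature] = abs(cur_mean - ref_mean) > threshold
    return drifted


# Hypothetical data: two features, four rows each
reference = {"age": [30, 32, 31, 29], "income": [50.0, 52.0, 51.0, 49.0]}
current = {"age": [30, 31, 32, 29], "income": [60.0, 62.0, 61.0, 59.0]}

result = toy_drift_check(reference, current)
print(result)
```

The outer loop runs once per feature and each `mean` call reads every value of that feature, which is the "per feature, per data point" pattern the list describes.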
As the number of data points grows, the time to compare distributions grows roughly in proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Small number of comparisons per feature |
| 100 | About 10 times more comparisons |
| 1000 | About 100 times more comparisons |
Pattern observation: The workload grows linearly with the number of data points and features.
Time Complexity: O(n * f)
This means the time grows proportionally with the number of data points (n) and the number of features (f) being monitored.
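The linear growth can be checked with a small counting sketch. It assumes each feature is compared with a single pass over the reference rows and a single pass over the current rows; the helper `count_value_visits` is hypothetical and only counts how many values such a comparison would touch.

```python
def count_value_visits(n_rows, n_features):
    """Count the values touched by a one-pass-per-feature comparison."""
    visits = 0
    for _ in range(n_features):      # each monitored feature
        for _ in range(n_rows):      # each reference row
            visits += 1
        for _ in range(n_rows):      # each current row
            visits += 1
    return visits


baseline = count_value_visits(10, 5)

# Growth relative to n = 10 with the feature count held at 5
row_ratios = {n: count_value_visits(n, 5) // baseline for n in (10, 100, 1000)}

# Growth when the feature count doubles and the row count stays at 10
feature_ratio = count_value_visits(10, 10) // baseline

print(row_ratios, feature_ratio)
```

Under these assumptions, ten times more rows means ten times more visits, and twice as many features means twice as many visits, matching O(n * f).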
[X] Wrong: "The monitoring time stays the same no matter how much data we have."
[OK] Correct: More data means more comparisons to detect drift, so the time increases with data size.
Understanding how monitoring scales with data size helps you design efficient ML pipelines and shows you can think about real-world system performance.
"What if we added more complex drift checks that compare pairs of features? How would the time complexity change?"
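One way to reason about this question is to count the work for a pairwise check. The sketch below assumes every unordered pair of features is compared once, with one pass over the rows per pair; `count_pairwise_visits` is a hypothetical helper, not part of Evidently.

```python
from itertools import combinations


def count_pairwise_visits(n_rows, n_features):
    """Count row passes when every unordered pair of features is compared."""
    visits = 0
    for _a, _b in combinations(range(n_features), 2):  # f * (f - 1) / 2 pairs
        visits += n_rows                               # one pass over the rows per pair
    return visits


five_features = count_pairwise_visits(1000, 5)
ten_features = count_pairwise_visits(1000, 10)
print(five_features, ten_features)
```

With these assumptions, doubling the features from 5 to 10 raises the pair count from 10 to 45, so the work grows roughly as O(n * f^2) rather than O(n * f).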
Practice
Solution
Step 1: Understand Evidently AI's role
Evidently AI is designed to monitor ML models by checking if data or predictions have changed over time.
Step 2: Identify the main function
It compares old and new data or predictions to detect data drift or performance issues.
Final Answer:
To compare old and new data or predictions to detect changes -> Option D
Quick Check:
Monitoring = Comparing data changes [OK]
- Confusing monitoring with training models
- Thinking Evidently deploys models
- Assuming it preprocesses data
Solution
Step 1: Review Evidently dashboard syntax
The Dashboard class expects a list of tab instances passed as the tabs parameter.
Step 2: Check correct instantiation
Tabs must be instantiated with parentheses and passed as a list to the tabs argument.
Final Answer:
dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()]) -> Option B
Quick Check:
Tabs list with instances = dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()]) [OK]
- Passing classes instead of instances
- Not using a list for tabs
- Incorrect argument syntax
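The first mistake, passing classes instead of instances, is a general Python pitfall and can be shown without Evidently. In this sketch `ToyTab` is a hypothetical stand-in class, not an Evidently tab.

```python
class ToyTab:
    """Hypothetical stand-in for a dashboard tab; not an Evidently class."""

    def render(self):
        return "rendered"


instances = [ToyTab(), ToyTab()]   # parentheses create objects
classes = [ToyTab, ToyTab]         # no parentheses: the list holds the class itself

# Instance methods work on the instantiated objects
rendered = [tab.render() for tab in instances]

# Calling the method on the bare class fails because no `self` is supplied
try:
    classes[0].render()
    error = None
except TypeError as exc:
    error = type(exc).__name__

print(rendered, error)
```

The parentheses in `ToyTab()` are what turn the class into an object that instance methods can run on.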
report.save_html('report.html')?
from evidently.dashboard import Dashboard
from evidently.tabs import DataDriftTab
dashboard = Dashboard(tabs=[DataDriftTab()])
report = dashboard.calculate(reference_data, current_data)
report.save_html('report.html')
Solution
Step 1: Understand save_html() method
The save_html() method saves the dashboard report as an HTML file to the given path.
Step 2: Analyze the code behavior
Calling report.save_html('report.html') creates a file named 'report.html' containing the report content.
Final Answer:
A new HTML file named 'report.html' is created with the dashboard report -> Option A
Quick Check:
save_html() saves file = A new HTML file named 'report.html' is created with the dashboard report [OK]
- Thinking save_html() returns a string
- Expecting console output instead of file
- Assuming save_html() is invalid
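The difference between returning a string and writing a file can be sketched with the standard library. `ToyReport` below is a hypothetical stand-in that mimics a save-to-file method; it is not Evidently's implementation.

```python
import tempfile
from pathlib import Path


class ToyReport:
    """Hypothetical stand-in for a report object; not Evidently's implementation."""

    def __init__(self, body):
        self.body = body

    def save_html(self, path):
        # Writes the HTML to disk and returns nothing to the caller
        html = f"<html><body>{self.body}</body></html>"
        Path(path).write_text(html, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "report.html"
    returned = ToyReport("drift summary").save_html(target)
    file_exists = target.exists()
    contents = target.read_text(encoding="utf-8")

print(returned, file_exists)
```

In this sketch the result of the call is a file on disk, while the call itself returns `None` and prints nothing.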
TypeError: 'Dashboard' object is not callable. What is the likely cause?
dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()])
report = dashboard(reference_data, current_data)
Solution
Step 1: Identify the error cause
The error says the Dashboard object is not callable: dashboard(...) tries to invoke the instance like a function, which the class does not support.
Step 2: Correct method to generate report
To get a report, you must call dashboard.calculate(reference_data, current_data), not dashboard().
Final Answer:
You should call dashboard.calculate() instead of dashboard() -> Option C
Quick Check:
Use calculate() method to get report [OK]
- Calling dashboard() directly instead of .calculate()
- Assuming tabs must be strings
- Thinking imports cause this error
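The "object is not callable" error is ordinary Python behavior and can be reproduced without Evidently. In this sketch `ToyDashboard` is a hypothetical class that defines `calculate()` but no `__call__()`; the dictionary it returns is invented for the example.

```python
class ToyDashboard:
    """Hypothetical stand-in; it defines calculate() but no __call__()."""

    def calculate(self, reference_data, current_data):
        return {
            "reference_rows": len(reference_data),
            "current_rows": len(current_data),
        }


dashboard = ToyDashboard()

# Invoking the instance like a function raises TypeError
try:
    dashboard([1, 2, 3], [4, 5])
    message = None
except TypeError as exc:
    message = str(exc)

# Calling the method by name works
summary = dashboard.calculate([1, 2, 3], [4, 5])
print(message, summary)
```

An instance is only callable when its class defines `__call__`; otherwise the work has to go through a named method.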
Solution
Step 1: Identify tabs for data drift and performance
DataDriftTab monitors changes in input data distribution. ClassificationPerformanceTab tracks model prediction quality for classification tasks.
Step 2: Choose correct combination for monitoring
To monitor both data drift and model performance for classification, use DataDriftTab and ClassificationPerformanceTab together.
Final Answer:
DataDriftTab and ClassificationPerformanceTab -> Option A
Quick Check:
Data drift + classification performance = DataDriftTab and ClassificationPerformanceTab [OK]
- Mixing performance tabs for regression with classification
- Using DataQualityTab instead of DataDriftTab
- Choosing two performance tabs without data drift
