Evidently AI for monitoring in MLOps - Time & Space Complexity

Time Complexity: Evidently AI for monitoring
O(n * f)
Understanding Time Complexity

When using Evidently AI for monitoring machine learning models, it's important to understand how the time to process data grows as the amount of data increases.

We want to know how the monitoring workload changes when we feed more data for analysis.

Scenario Under Consideration

Analyze the time complexity of the following Evidently AI monitoring code snippet.

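# Imports follow the Evidently 0.4.x-style Report API; other releases may use different module paths
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# reference_data and current_data are pandas DataFrames with the same columns
report = Report(metrics=[DataDriftPreset()])
# run() takes keyword arguments in this API
report.run(reference_data=reference_data, current_data=current_data)
report.save_html('monitoring_report.html')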

This code builds a report that detects data drift by comparing the reference and current datasets, then saves the result as an HTML file.

Identify Repeating Operations

Look at what repeats when the report calculates data drift.

  • Primary operation: Comparing each feature's distribution between reference and current datasets.
  • How many times: Once per feature, and each comparison reads every data point of that feature in both datasets.
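
The repeating operation can be sketched in plain Python. This is only an illustration: it uses a simple mean-shift statistic as a stand-in, not the statistical tests Evidently actually runs, and the function name mean_shift_per_feature and the sample columns are made up for this example. The outer loop runs once per feature, and each pass reads every value of that feature in both datasets.

```python
from statistics import fmean


def mean_shift_per_feature(reference, current):
    """Return ({feature: absolute mean difference}, number of values read).

    reference and current map a feature name to a list of numeric values.
    Stand-in statistic for illustration, not Evidently's drift test.
    """
    shifts = {}
    values_read = 0
    for feature in reference:  # runs f times, once per feature
        ref_values = reference[feature]
        cur_values = current[feature]
        # fmean reads each value once, so this is O(n) work per feature
        shifts[feature] = abs(fmean(cur_values) - fmean(ref_values))
        values_read += len(ref_values) + len(cur_values)
    return shifts, values_read


reference = {"age": [30, 40, 50], "income": [10.0, 20.0, 30.0]}
current = {"age": [35, 45, 55], "income": [10.0, 20.0, 30.0]}

shifts, values_read = mean_shift_per_feature(reference, current)
print(shifts)       # per-feature shift
print(values_read)  # grows with rows * features
```

With 2 features and 3 rows in each dataset, the sketch reads 12 values. Doubling either the rows or the features doubles that count.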
How Execution Grows With Input

As the number of data points grows, the time to compare distributions grows roughly in proportion.

Input Size (n) | Approx. Operations
10             | Small number of comparisons per feature
100            | About 10 times more comparisons than at n = 10
1000           | About 100 times more comparisons than at n = 10

Pattern observation: The workload grows linearly with the number of data points and features.

Final Time Complexity

Time Complexity: O(n * f)

This means the time grows proportionally with the number of data points (n) and the number of features (f) being monitored.
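
As a rough cost model, assume each feature's comparison spends a constant amount of work c on every data point (c is an assumed constant for illustration, not a measured value). Summing over all f features gives:

```latex
T(n, f) = \sum_{j=1}^{f} c \cdot n = c \cdot n \cdot f
\quad\Rightarrow\quad T(n, f) \in O(n \cdot f)
```

This holds only while the per-feature check is linear in n. A check that sorts the values first would cost O(n log n) per feature instead.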

Common Mistake

[X] Wrong: "The monitoring time stays the same no matter how much data we have."

[OK] Correct: More data means more comparisons to detect drift, so the time increases with data size.

Interview Connect

Understanding how monitoring scales with data size helps you design efficient ML pipelines and shows you can think about real-world system performance.

Self-Check

"What if we added more complex drift checks that compare pairs of features? How would the time complexity change?"

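One way to reason about this question is to count the work directly. The sketch below assumes that comparing one pair of features reads both columns in full. The function name count_pairwise_work is hypothetical, and no Evidently API is involved.

```python
from itertools import combinations


def count_pairwise_work(n_rows, n_features):
    """Return (number of feature pairs, values read) for all unordered pairs."""
    pairs = list(combinations(range(n_features), 2))  # f * (f - 1) / 2 pairs
    # Assumption: each pair comparison reads two full columns of n_rows values
    return len(pairs), len(pairs) * 2 * n_rows


for f in (5, 10, 20):
    pairs, work = count_pairwise_work(n_rows=1000, n_features=f)
    print(f, pairs, work)
```

Under that assumption, doubling the number of features roughly quadruples the work, so the cost grows like O(n * f^2) rather than O(n * f).
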
Practice

(1/5)
1. What is the main purpose of Evidently AI in ML model monitoring?
easy
A. To clean and preprocess raw data before training
B. To train new machine learning models automatically
C. To deploy ML models to production environments
D. To compare old and new data or predictions to detect changes

Solution

  1. Step 1: Understand Evidently AI's role

    Evidently AI is designed to monitor ML models by checking if data or predictions have changed over time.
  2. Step 2: Identify the main function

    It compares old and new data or predictions to detect data drift or performance issues.
  3. Final Answer:

    To compare old and new data or predictions to detect changes -> Option D
  4. Quick Check:

    Monitoring = Comparing data changes [OK]
Hint: Evidently AI checks data changes, not training or deployment [OK]
Common Mistakes:
  • Confusing monitoring with training models
  • Thinking Evidently deploys models
  • Assuming it preprocesses data
2. Which of the following is the correct way to create an Evidently dashboard with tabs for data drift and performance?
easy
A. dashboard = Dashboard(tabs=DataDriftTab(), PerformanceTab())
B. dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()])
C. dashboard = Dashboard(tabs=[PerformanceTab, DataDriftTab])
D. dashboard = Dashboard([DataDriftTab(), PerformanceTab()])

Solution

  1. Step 1: Review Evidently dashboard syntax

    The Dashboard class expects a list of tab instances passed as the tabs parameter.
  2. Step 2: Check correct instantiation

    Tabs must be instantiated with parentheses and passed as a list to the tabs argument.
  3. Final Answer:

    dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()]) -> Option B
  4. Quick Check:

    Tabs list with instances = dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()]) [OK]
Hint: Tabs must be instances inside a list assigned to tabs parameter [OK]
Common Mistakes:
  • Passing classes instead of instances
  • Not using a list for tabs
  • Incorrect argument syntax
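
The mistakes above can be checked with plain Python, without Evidently installed. In this sketch DataDriftTab and PerformanceTab are empty stand-in classes, and the helper names are hypothetical. It shows that option A is rejected by the Python parser itself, and that a list of classes differs from a list of instances.

```python
class DataDriftTab:  # hypothetical stand-in, not the Evidently class
    pass


class PerformanceTab:  # hypothetical stand-in, not the Evidently class
    pass


def is_valid_call_syntax(source):
    """True if the expression parses; a positional argument after a keyword argument does not."""
    try:
        compile(source, "<quiz>", "eval")
        return True
    except SyntaxError:
        return False


def all_instances(tabs):
    """True if every element is an object instance rather than a class."""
    return all(not isinstance(tab, type) for tab in tabs)


option_a_ok = is_valid_call_syntax("Dashboard(tabs=DataDriftTab(), PerformanceTab())")
option_b_ok = is_valid_call_syntax("Dashboard(tabs=[DataDriftTab(), PerformanceTab()])")
print(option_a_ok, option_b_ok)

print(all_instances([DataDriftTab(), PerformanceTab()]))  # option B style
print(all_instances([PerformanceTab, DataDriftTab]))      # option C style
```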
3. Given the following code snippet, what happens when report.save_html('report.html') runs?
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference_data, current_data=current_data)
report.save_html('report.html')
medium
A. A new HTML file named 'report.html' is created with the dashboard report
B. An error because save_html() returns a string, not a file
C. The report is printed to the console instead of saved
D. Nothing happens because save_html() is not a valid method

Solution

  1. Step 1: Understand save_html() method

    The save_html() method saves the dashboard report as an HTML file to the given path.
  2. Step 2: Analyze the code behavior

    Calling report.save_html('report.html') creates a file named 'report.html' containing the report content.
  3. Final Answer:

    A new HTML file named 'report.html' is created with the dashboard report -> Option A
  4. Quick Check:

    save_html() saves file = A new HTML file named 'report.html' is created with the dashboard report [OK]
Hint: save_html() writes an HTML file, does not print or error [OK]
Common Mistakes:
  • Thinking save_html() returns a string
  • Expecting console output instead of file
  • Assuming save_html() is invalid
4. You wrote this code but get an error: TypeError: 'Dashboard' object is not callable. What is the likely cause?
dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()])
report = dashboard(reference_data, current_data)
medium
A. You forgot to import DataDriftTab
B. Tabs must be strings, not instances
C. You should call dashboard.calculate() instead of dashboard()
D. You need to instantiate PerformanceTab with parameters

Solution

  1. Step 1: Identify the error cause

    The error says the Dashboard object is not callable. The syntax is valid, but a Dashboard instance cannot be invoked like a function, so dashboard(...) fails at runtime.
  2. Step 2: Correct method to generate report

    To get a report, you must call dashboard.calculate(reference_data, current_data), not dashboard().
  3. Final Answer:

    You should call dashboard.calculate() instead of dashboard() -> Option C
  4. Quick Check:

    Use calculate() method to get report [OK]
Hint: Dashboard object is not callable means missing .calculate() [OK]
Common Mistakes:
  • Calling dashboard() directly instead of .calculate()
  • Assuming tabs must be strings
  • Thinking imports cause this error
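
The same error can be reproduced with any object that does not define __call__. The sketch below uses a hypothetical StandInDashboard class in place of the real Dashboard, and its calculate() returns a string only to show that the method call succeeds.

```python
class StandInDashboard:
    """Hypothetical stand-in; it only mimics having a calculate() method."""

    def calculate(self, reference_data, current_data):
        return (
            f"compared {len(reference_data)} reference rows "
            f"with {len(current_data)} current rows"
        )


dashboard = StandInDashboard()

try:
    dashboard([1, 2, 3], [4, 5])  # calling the instance itself
except TypeError as err:
    message = str(err)
    print(message)

result = dashboard.calculate([1, 2, 3], [4, 5])  # calling the method works
print(result)
```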
5. You want to monitor a classification model's prediction quality over time using Evidently AI. Which combination of tabs should you include in your dashboard to track data drift and model performance together?
hard
A. DataDriftTab and ClassificationPerformanceTab
B. DataQualityTab and RegressionPerformanceTab
C. DataDriftTab and DataQualityTab
D. RegressionPerformanceTab and ClassificationPerformanceTab

Solution

  1. Step 1: Identify tabs for data drift and performance

    DataDriftTab monitors changes in input data distribution. ClassificationPerformanceTab tracks model prediction quality for classification tasks.
  2. Step 2: Choose correct combination for monitoring

    To monitor both data drift and model performance for classification, use DataDriftTab and ClassificationPerformanceTab together.
  3. Final Answer:

    DataDriftTab and ClassificationPerformanceTab -> Option A
  4. Quick Check:

    Data drift + classification performance = DataDriftTab and ClassificationPerformanceTab [OK]
Hint: Match tabs to task: DataDrift + ClassificationPerformance for classification [OK]
Common Mistakes:
  • Mixing performance tabs for regression with classification
  • Using DataQualityTab instead of DataDriftTab
  • Choosing two performance tabs without data drift