MLOps, DevOps, ~30 mins

Evidently AI for monitoring in MLOps - Mini Project: Build & Apply

Evidently AI for Monitoring
📖 Scenario: You work as a data engineer in a company that runs machine learning models in production. Your team wants to monitor the quality of predictions over time to catch any problems early. You will use Evidently AI, a tool that helps track data and model quality metrics easily.
🎯 Goal: Build a simple Python script that loads sample prediction data, configures Evidently AI to monitor data drift, runs the monitoring analysis, and prints the report summary.
📋 What You'll Learn
Create a pandas DataFrame called reference_data with sample features and predictions.
Create a variable called data_drift_report using Evidently AI's Report class with the DataDriftPreset metric preset.
Call the run method on data_drift_report, passing reference_data as both the reference and the current data.
Print the JSON summary of the report using data_drift_report.json().
💡 Why This Matters
🌍 Real World
Monitoring machine learning models in production helps catch data or model quality issues early, preventing bad decisions or degraded user experience.
💼 Career
Data engineers and MLOps engineers use tools like Evidently AI to automate monitoring and ensure ML models stay reliable over time.
Step 1: Create sample prediction data
Create a pandas DataFrame called reference_data with these exact columns and values: 'feature1' with values [10, 20, 30, 40, 50], 'feature2' with values [1, 2, 3, 4, 5], and 'prediction' with values [0, 1, 0, 1, 0].
Hint: Use pd.DataFrame with a dictionary of lists for columns.
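Step 1 can be sketched as follows. This is a minimal example that assumes pandas is installed and imported under its conventional alias pd. The column names and values come straight from the step description; the final print is only a visual check and is not required by the exercise.

```python
import pandas as pd

# Reference dataset: two numeric features plus the model's binary prediction.
reference_data = pd.DataFrame(
    {
        "feature1": [10, 20, 30, 40, 50],
        "feature2": [1, 2, 3, 4, 5],
        "prediction": [0, 1, 0, 1, 0],
    }
)

# Visual check of the table that the later steps will monitor.
print(reference_data)
```

Each dictionary key becomes a column name and each list becomes that column's values, so all lists must have the same length.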

Step 2: Configure the Evidently AI data drift report
Import Report from evidently.report and DataDriftPreset from evidently.metric_preset (the module paths used by Evidently 0.2 through 0.6). Then create a variable called data_drift_report and assign it a Report instance with the metrics argument set to [DataDriftPreset()].
Hint: Use Report(metrics=[DataDriftPreset()]) to monitor data drift.

Step 3: Run the data drift analysis
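Call the run method on data_drift_report, passing reference_data as both the reference_data and current_data arguments. Because the two datasets are identical, the report should find no drift, which makes this a useful sanity check before you compare real production data.
Hint: Call run on data_drift_report with the same DataFrame for both arguments.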

Step 4: Print the data drift report summary
Print the JSON summary of the data drift report by calling print(data_drift_report.json()).
Hint: Use print(data_drift_report.json()) to show the report summary in JSON format.

Practice

1. What is the main purpose of Evidently AI in ML model monitoring?
Difficulty: easy
A. To clean and preprocess raw data before training
B. To train new machine learning models automatically
C. To deploy ML models to production environments
D. To compare old and new data or predictions to detect changes

Solution

  1. Step 1: Understand Evidently AI's role

    Evidently AI is designed to monitor ML models by checking if data or predictions have changed over time.
  2. Step 2: Identify the main function

    It compares old and new data or predictions to detect data drift or performance issues.
  3. Final Answer:

    To compare old and new data or predictions to detect changes -> Option D
  4. Quick Check:

    Monitoring = Comparing data changes [OK]
Hint: Evidently AI checks data changes, not training or deployment [OK]
Common Mistakes:
  • Confusing monitoring with training models
  • Thinking Evidently deploys models
  • Assuming it preprocesses data
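To make "comparing old and new data" concrete, the sketch below computes a two-sample Kolmogorov-Smirnov statistic by hand using only the standard library. This is an illustration of the idea, not Evidently's implementation: Evidently selects among several statistical tests, and the helper name ks_statistic and the sample values for the shifted batch are assumptions made for this example.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = fully separated)."""
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for x in ref_sorted + cur_sorted:
        ref_cdf = bisect_right(ref_sorted, x) / len(ref_sorted)
        cur_cdf = bisect_right(cur_sorted, x) / len(cur_sorted)
        gap = max(gap, abs(ref_cdf - cur_cdf))
    return gap


reference = [10, 20, 30, 40, 50]

# Same data on both sides: the distributions match exactly.
same = ks_statistic(reference, reference)

# Hypothetical new batch whose values all moved upward: the distributions no longer overlap.
shifted = ks_statistic(reference, [60, 70, 80, 90, 100])

print(same, shifted)
```

A monitoring tool turns a statistic like this into a drift decision by comparing it, or its p-value, against a threshold.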
2. Which of the following is the correct way to create an Evidently dashboard with tabs for data drift and performance?
Difficulty: easy
A. dashboard = Dashboard(tabs=DataDriftTab(), PerformanceTab())
B. dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()])
C. dashboard = Dashboard(tabs=[PerformanceTab, DataDriftTab])
D. dashboard = Dashboard([DataDriftTab(), PerformanceTab()])

Solution

  1. Step 1: Review Evidently dashboard syntax

    The Dashboard class expects a list of tab instances passed as the tabs parameter.
  2. Step 2: Check correct instantiation

    Tabs must be instantiated with parentheses and passed as a list to the tabs argument.
  3. Final Answer:

    dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()]) -> Option B
  4. Quick Check:

    List of tab instances passed through the tabs argument -> Option B [OK]
Hint: Tabs must be instances inside a list assigned to tabs parameter [OK]
Common Mistakes:
  • Passing classes instead of instances
  • Not using a list for tabs
  • Incorrect argument syntax
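Two of the mistakes above are plain Python issues and can be checked without installing Evidently. The sketch below uses compile to show that Option A is rejected by the parser, and a hypothetical stand-in class named ExampleTab (not part of Evidently) to show the difference between passing a class and passing an instance.

```python
class ExampleTab:
    """Hypothetical stand-in for a tab class; not part of Evidently."""


# Option A: a positional argument after a keyword argument is a syntax error,
# so Python rejects the line before any library code runs.
option_a = "dashboard = Dashboard(tabs=DataDriftTab(), PerformanceTab())"
try:
    compile(option_a, "<option_a>", "exec")
    option_a_error = None
except SyntaxError as error:
    option_a_error = error.msg

# Option B: the same tabs wrapped in a list parse without complaint.
option_b = "dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()])"
compile(option_b, "<option_b>", "exec")

# Option C passes classes; Option B passes instances created with parentheses.
tabs_as_classes = [ExampleTab]
tabs_as_instances = [ExampleTab()]

print(option_a_error)
print(isinstance(tabs_as_classes[0], type), isinstance(tabs_as_instances[0], type))
```

compile only parses the source, so the undefined names Dashboard, DataDriftTab, and PerformanceTab do not matter here.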
3. Given the following code snippet, what happens when report.save_html('report.html') is called?
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference_data, current_data=current_data)
report.save_html('report.html')
Difficulty: medium
A. A new HTML file named 'report.html' is created with the dashboard report
B. An error because save_html() returns a string, not a file
C. The report is printed to the console instead of saved
D. Nothing happens because save_html() is not a valid method

Solution

  1. Step 1: Understand save_html() method

    The save_html() method saves the dashboard report as an HTML file to the given path.
  2. Step 2: Analyze the code behavior

    Calling report.save_html('report.html') creates a file named 'report.html' containing the report content.
  3. Final Answer:

    A new HTML file named 'report.html' is created with the dashboard report -> Option A
  4. Quick Check:

    save_html() writes the report to a file -> Option A [OK]
Hint: save_html() writes an HTML file, does not print or error [OK]
Common Mistakes:
  • Thinking save_html() returns a string
  • Expecting console output instead of file
  • Assuming save_html() is invalid
4. You wrote this code but get an error: TypeError: 'Dashboard' object is not callable. What is the likely cause?
dashboard = Dashboard(tabs=[DataDriftTab(), PerformanceTab()])
report = dashboard(reference_data, current_data)
Difficulty: medium
A. You forgot to import DataDriftTab
B. Tabs must be strings, not instances
C. You should call dashboard.calculate() instead of dashboard()
D. You need to instantiate PerformanceTab with parameters

Solution

  1. Step 1: Identify the error cause

    The error says the Dashboard object is not callable: a Dashboard instance cannot be invoked like a function, so dashboard(...) raises a TypeError at runtime.
  2. Step 2: Correct method to generate report

    To compute the report, call dashboard.calculate(reference_data, current_data) rather than dashboard(reference_data, current_data).
  3. Final Answer:

    You should call dashboard.calculate() instead of dashboard() -> Option C
  4. Quick Check:

    Use calculate() method to get report [OK]
Hint: "object is not callable" here means the .calculate() method name is missing [OK]
Common Mistakes:
  • Calling dashboard() directly instead of .calculate()
  • Assuming tabs must be strings
  • Thinking imports cause this error
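The error can be reproduced without Evidently. The sketch below uses a hypothetical stand-in class, HypotheticalDashboard, that (like the Dashboard in the question) exposes a calculate method but defines no __call__; the class body and its results attribute are assumptions made purely for illustration.

```python
class HypotheticalDashboard:
    """Stand-in for illustration only; not Evidently's Dashboard class."""

    def __init__(self, tabs):
        self.tabs = tabs
        self.results = None

    def calculate(self, reference_data, current_data):
        # Placeholder computation: record how many rows were compared.
        self.results = {"rows_compared": (len(reference_data), len(current_data))}


dashboard = HypotheticalDashboard(tabs=[])
reference_data = [1, 2, 3]
current_data = [1, 2, 3]

# Calling the instance itself fails because the class defines no __call__.
try:
    dashboard(reference_data, current_data)
    message = None
except TypeError as error:
    message = str(error)

# Calling the method by name works.
dashboard.calculate(reference_data, current_data)

print(message)
print(dashboard.results)
```

Any Python object raises this TypeError when it is followed by parentheses and its class does not implement __call__, so the fix is always to name the method you meant to call.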
5. You want to monitor a classification model's prediction quality over time using Evidently AI. Which combination of tabs should you include in your dashboard to track data drift and model performance together?
Difficulty: hard
A. DataDriftTab and ClassificationPerformanceTab
B. DataQualityTab and RegressionPerformanceTab
C. DataDriftTab and DataQualityTab
D. RegressionPerformanceTab and ClassificationPerformanceTab

Solution

  1. Step 1: Identify tabs for data drift and performance

    DataDriftTab monitors changes in input data distribution. ClassificationPerformanceTab tracks model prediction quality for classification tasks.
  2. Step 2: Choose correct combination for monitoring

    To monitor both data drift and model performance for classification, use DataDriftTab and ClassificationPerformanceTab together.
  3. Final Answer:

    DataDriftTab and ClassificationPerformanceTab -> Option A
  4. Quick Check:

    Data drift + classification performance -> Option A [OK]
Hint: Match tabs to task: DataDrift + ClassificationPerformance for classification [OK]
Common Mistakes:
  • Mixing performance tabs for regression with classification
  • Using DataQualityTab instead of DataDriftTab
  • Choosing two performance tabs without data drift