
How to Use Evidently for Monitoring Machine Learning Models

Use Evidently by creating a report with monitoring metrics such as data drift or model performance, then running it on your data periodically to track changes. You can generate HTML reports or dashboards to easily visualize the monitoring results.

Syntax

The basic syntax to use Evidently involves importing the library, creating a Report object with desired metrics, and running it on your reference and current datasets. You then generate a report output such as HTML.

  • Report(metrics=[...]): Define what to monitor (e.g., data drift, target drift).
  • report.run(reference_data, current_data): Compare baseline and new data.
  • report.save_html('report.html'): Save the monitoring results as an HTML file.
python
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Create a report with data drift metrics
report = Report(metrics=[DataDriftPreset()])

# Run report on reference and current datasets
report.run(reference_data=reference_df, current_data=current_df)

# Save report to HTML
report.save_html('monitoring_report.html')
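To build intuition for what a preset like DataDriftPreset automates, here is a minimal, library-free sketch that flags a column as drifted when its mean shifts by more than a relative threshold. This is only an illustration of the idea; Evidently's presets use proper per-column statistical tests, not this naive rule.

```python
import pandas as pd

def naive_drift_check(reference_df: pd.DataFrame, current_df: pd.DataFrame,
                      threshold: float = 0.1) -> dict:
    """Flag columns whose mean shifted by more than `threshold` (relative).

    Illustrative only -- Evidently's presets use real statistical tests.
    """
    drifted = {}
    for col in reference_df.columns:
        ref_mean = reference_df[col].mean()
        cur_mean = current_df[col].mean()
        # Relative shift of the column mean vs. the reference baseline
        shift = abs(cur_mean - ref_mean) / (abs(ref_mean) or 1.0)
        drifted[col] = shift > threshold
    return drifted

reference_df = pd.DataFrame({'feature1': [1, 2, 3, 4, 5]})
current_df = pd.DataFrame({'feature1': [2, 3, 4, 5, 6]})
print(naive_drift_check(reference_df, current_df))  # → {'feature1': True}
```

The mean shifts from 3 to 4 (about 33% relative), which exceeds the 10% threshold, so the column is flagged.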

Example

This example shows how to monitor data drift between a reference dataset and a new dataset using Evidently. It creates a report and saves it as an HTML file you can open in a browser.

python
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Sample reference data
reference_df = pd.DataFrame({
    'feature1': [1, 2, 3, 4, 5],
    'feature2': [10, 20, 30, 40, 50]
})

# Sample current data with some drift
current_df = pd.DataFrame({
    'feature1': [2, 3, 4, 5, 6],
    'feature2': [15, 25, 35, 45, 55]
})

# Create Evidently report for data drift
report = Report(metrics=[DataDriftPreset()])

# Run the report
report.run(reference_data=reference_df, current_data=current_df)

# Save the report as HTML
report.save_html('data_drift_report.html')

print('Monitoring report saved as data_drift_report.html')
Output
Monitoring report saved as data_drift_report.html
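The example above compares a single snapshot. In production you would typically rerun the same report on a schedule; the helper below (a sketch, not part of Evidently) splits timestamped data into weekly batches and derives one report path per batch, so each batch can then be passed to report.run as current_data.

```python
import pandas as pd

def weekly_batches(df: pd.DataFrame, ts_col: str = 'ts'):
    """Yield (batch, report_path) pairs, one per calendar week.

    Sketch only: each batch would be passed to report.run() as current_data.
    """
    for period, batch in df.groupby(pd.Grouper(key=ts_col, freq='W')):
        if not batch.empty:
            yield batch, f"drift_report_{period.date()}.html"

df = pd.DataFrame({
    'ts': pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-08']),
    'feature1': [1, 2, 3],
})
for batch, path in weekly_batches(df):
    print(path, len(batch))
```

Each iteration gives one week of data and a distinct file name, so older HTML reports are kept rather than overwritten.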

Common Pitfalls

  • Not providing both reference and current data: Evidently needs a baseline (reference) and new data to compare.
  • Using incompatible data formats: Data must be pandas DataFrames with matching columns.
  • Ignoring metric presets: Use built-in presets like DataDriftPreset or define custom metrics properly.
  • Not saving or visualizing reports: Without saving or rendering, you won't see monitoring results.
python
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# reference_df and current_df are the DataFrames from the example above

# Wrong: Missing current_data
report = Report(metrics=[DataDriftPreset()])
try:
    report.run(reference_data=reference_df)
except TypeError as e:
    print(f'Error: {e}')

# Right: Provide both datasets
report.run(reference_data=reference_df, current_data=current_df)
Output
Error: run() missing 1 required keyword-only argument: 'current_data'
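To guard against the "incompatible data formats" pitfall, a quick schema check before running the report surfaces column mismatches early with a clear message. This helper is an illustration written for this article, not an Evidently API.

```python
import pandas as pd

def check_matching_columns(reference_df: pd.DataFrame,
                           current_df: pd.DataFrame) -> None:
    """Raise ValueError if the two frames do not share the same column set."""
    missing = set(reference_df.columns) - set(current_df.columns)
    extra = set(current_df.columns) - set(reference_df.columns)
    if missing or extra:
        raise ValueError(
            f"Column mismatch -- missing from current: {sorted(missing)}, "
            f"unexpected in current: {sorted(extra)}"
        )

reference_df = pd.DataFrame({'feature1': [1], 'feature2': [2]})
current_df = pd.DataFrame({'feature1': [1]})  # feature2 is missing
try:
    check_matching_columns(reference_df, current_df)
except ValueError as e:
    print(f'Error: {e}')
```

Running the check before report.run turns a confusing downstream failure into an explicit, actionable error.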

Quick Reference

Step | Description | Code snippet
1 | Import Evidently and metric presets | from evidently.report import Report; from evidently.metric_preset import DataDriftPreset
2 | Create a report with desired metrics | report = Report(metrics=[DataDriftPreset()])
3 | Run the report on reference and current data | report.run(reference_data=reference_df, current_data=current_df)
4 | Save or display the report | report.save_html('report.html')

Key Takeaways

  • Evidently compares reference and current data to detect changes like data drift.
  • Always provide both reference and current datasets as pandas DataFrames.
  • Use built-in metric presets for easy monitoring setup.
  • Save reports as HTML to visualize monitoring results in a browser.
  • Check for common errors like missing data inputs or incompatible formats.