How to Use Evidently for Monitoring Machine Learning Models
Use Evidently by creating a report with monitoring metrics such as data drift or model performance, then running it on your data periodically to track changes. You can generate HTML reports or dashboards to visualize the monitoring results.

Syntax

The basic workflow is to import the library, create a Report object with the desired metrics, run it on your reference and current datasets, and then save the output, for example as HTML. The examples here follow Evidently's legacy `Report` API (pre-0.7); check the version you have installed, as the package was restructured in later releases.
- `Report(metrics=[...])`: Define what to monitor (e.g., data drift, target drift).
- `report.run(reference_data, current_data)`: Compare baseline and new data.
- `report.save_html('report.html')`: Save the monitoring results as an HTML file.
```python
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Create a report with data drift metrics
report = Report(metrics=[DataDriftPreset()])

# Run the report on reference and current datasets
report.run(reference_data=reference_df, current_data=current_df)

# Save the report to HTML
report.save_html('monitoring_report.html')
```
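The "run it periodically" part of the workflow lives outside Evidently itself. Below is a minimal scheduling sketch, assuming you wrap the report generation in your own callable; `run_report` is a hypothetical placeholder for that wrapper, not an Evidently function:

```python
import time
from datetime import datetime, timezone

def report_filename(prefix='monitoring_report'):
    # Timestamped filename so periodic runs do not overwrite each other
    stamp = datetime.now(timezone.utc).strftime('%Y%m%d_%H%M%S')
    return f'{prefix}_{stamp}.html'

def monitor(run_report, interval_seconds=3600, max_runs=None):
    """Invoke run_report(path) on a fixed schedule.

    run_report is a caller-supplied callable (hypothetical) that fetches
    fresh current data, runs the Evidently report, and saves it to the
    given path. max_runs=None loops forever.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        run_report(report_filename())
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_seconds)
```

In production you would more likely trigger the same wrapper from cron, Airflow, or a similar scheduler; the loop above is just the simplest self-contained form.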
Example
This example shows how to monitor data drift between a reference dataset and a new dataset using Evidently. It creates a report and saves it as an HTML file you can open in a browser.
```python
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Sample reference data
reference_df = pd.DataFrame({
    'feature1': [1, 2, 3, 4, 5],
    'feature2': [10, 20, 30, 40, 50]
})

# Sample current data with some drift
current_df = pd.DataFrame({
    'feature1': [2, 3, 4, 5, 6],
    'feature2': [15, 25, 35, 45, 55]
})

# Create an Evidently report for data drift
report = Report(metrics=[DataDriftPreset()])

# Run the report
report.run(reference_data=reference_df, current_data=current_df)

# Save the report as HTML
report.save_html('data_drift_report.html')
print('Monitoring report saved as data_drift_report.html')
```
Output
Monitoring report saved as data_drift_report.html
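To build intuition for what the drift comparison measures, here is a rough pure-pandas sketch using the same toy data. This is only an illustration: `DataDriftPreset` applies proper per-column statistical tests, not a simple mean comparison.

```python
import pandas as pd

reference_df = pd.DataFrame({
    'feature1': [1, 2, 3, 4, 5],
    'feature2': [10, 20, 30, 40, 50],
})
current_df = pd.DataFrame({
    'feature1': [2, 3, 4, 5, 6],
    'feature2': [15, 25, 35, 45, 55],
})

# Relative shift of each column's mean between reference and current data --
# a crude stand-in for the statistical tests the preset actually runs.
mean_shift = (current_df.mean() - reference_df.mean()) / reference_df.mean()
print(mean_shift)
# feature1's mean moves from 3 to 4 (~33% shift); feature2's from 30 to 35 (~17%)
```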
Common Pitfalls
- Not providing both reference and current data: Evidently needs a baseline (reference) and new data to compare.
- Using incompatible data formats: Data must be pandas DataFrames with matching columns.
- Ignoring metric presets: Use built-in presets like `DataDriftPreset` or define custom metrics properly.
- Not saving or visualizing reports: Without saving or rendering, you won't see monitoring results.
```python
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Wrong: missing current_data
report = Report(metrics=[DataDriftPreset()])
try:
    report.run(reference_data=reference_df)
except TypeError as e:
    print(f'Error: {e}')

# Right: provide both datasets
report.run(reference_data=reference_df, current_data=current_df)
```
Output
Error: run() missing 1 required keyword-only argument: 'current_data'
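The incompatible-format pitfall can be caught early with a quick pure-pandas pre-flight check before calling `report.run()`. The helper below is hypothetical, not part of Evidently:

```python
import pandas as pd

def check_columns_match(reference_df, current_df):
    """Raise early, with a clear message, if the two frames' columns differ.

    Hypothetical helper (not an Evidently API): call it before report.run()
    to catch column mismatches with a readable error.
    """
    ref_cols = set(reference_df.columns)
    cur_cols = set(current_df.columns)
    missing = ref_cols - cur_cols
    extra = cur_cols - ref_cols
    if missing or extra:
        raise ValueError(
            f'Column mismatch: missing from current={sorted(missing)}, '
            f'unexpected in current={sorted(extra)}'
        )

ref = pd.DataFrame({'feature1': [1, 2], 'feature2': [3, 4]})
cur = pd.DataFrame({'feature1': [1, 2], 'feature3': [3, 4]})
try:
    check_columns_match(ref, cur)
except ValueError as e:
    print(e)
```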
Quick Reference
| Step | Description | Code snippet |
|---|---|---|
| 1 | Import Evidently and metric presets | `from evidently.report import Report`; `from evidently.metric_preset import DataDriftPreset` |
| 2 | Create a report with desired metrics | `report = Report(metrics=[DataDriftPreset()])` |
| 3 | Run report on reference and current data | `report.run(reference_data=reference_df, current_data=current_df)` |
| 4 | Save or display the report | `report.save_html('report.html')` |
Key Takeaways
- Evidently compares reference and current data to detect changes like data drift.
- Always provide both reference and current datasets as pandas DataFrames.
- Use built-in metric presets for easy monitoring setup.
- Save reports as HTML to visualize monitoring results in a browser.
- Check for common errors like missing data inputs or incompatible formats.