Experiment - Monitoring NLP models
Problem: You have a text classification NLP model deployed to classify customer reviews as positive or negative. The model performed well in training, but after deployment its performance may degrade over time as the language and topics in incoming reviews change.
Current Metrics: Training accuracy: 92%; Validation accuracy: 88%; Accuracy of the deployed model on recent data: 75%
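One way to catch this kind of degradation early is an automated accuracy alert that compares the deployed model against its validation baseline. The sketch below is illustrative; the accuracy values come from the metrics above, but the 5% relative-drop threshold is an assumption to be tuned for the deployment.

```python
# Minimal sketch of an accuracy-degradation alert.
VALIDATION_ACC = 0.88     # baseline accuracy on the held-out validation set
DEPLOYED_ACC = 0.75       # accuracy measured on recently labeled production data
MAX_RELATIVE_DROP = 0.05  # assumed threshold: alert on >5% relative drop

relative_drop = (VALIDATION_ACC - DEPLOYED_ACC) / VALIDATION_ACC
if relative_drop > MAX_RELATIVE_DROP:
    print(f"Alert: accuracy dropped {relative_drop:.1%} relative to validation baseline")
```

With the numbers above, the relative drop is about 14.8%, well past the assumed threshold, so the alert would fire.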
Issue: The model shows signs of performance degradation (accuracy dropped from 88% on validation to 75% on recent data), indicating possible data drift (the distribution of incoming review text has shifted) or concept drift (the relationship between the text and its sentiment label has changed).
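Data drift can often be detected without labels by comparing the token distribution of recent inputs against a training-time reference. The sketch below uses KL divergence over whitespace-tokenized word frequencies; the corpora and the 0.5 alert threshold are hypothetical placeholders, not values from this experiment.

```python
import math
from collections import Counter

def token_distribution(texts):
    # Relative frequency of each token across a corpus (naive whitespace tokenization).
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def kl_divergence(p, q, eps=1e-9):
    # KL(p || q) over the union vocabulary; eps smooths tokens unseen in one corpus.
    vocab = set(p) | set(q)
    return sum(p.get(t, eps) * math.log(p.get(t, eps) / q.get(t, eps)) for t in vocab)

# Hypothetical corpora: reference = training-time reviews, recent = production traffic.
reference = ["great product loved it", "terrible quality very disappointed"]
recent = ["the app keeps crashing on startup", "subscription price increased again"]

drift_score = kl_divergence(token_distribution(reference), token_distribution(recent))
DRIFT_THRESHOLD = 0.5  # assumed threshold; calibrate on historical windows
if drift_score > DRIFT_THRESHOLD:
    print(f"Data drift suspected: KL divergence = {drift_score:.2f}")
```

In practice the reference and recent corpora would be sampled windows of real traffic, and the threshold would be set from the score's historical variation during stable periods.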