Experiment - Text preprocessing (tokenization, stemming, lemmatization)
Problem: You have a text classification model that uses raw text data. The model's accuracy is low because the text is not preprocessed properly. Words like 'running', 'runs', and 'ran' are treated as different words, confusing the model.
Current Metrics: Training accuracy 65%, validation accuracy 60%
Issue: The model suffers from low accuracy because inconsistent word forms and noisy text reach it unprocessed; no tokenization, stemming, or lemmatization is applied.
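To make the fix concrete, the sketch below shows the three preprocessing steps on the problem's own example words. The helper names (`tokenize`, `stem`, `lemmatize`) and the tiny suffix list and lemma dictionary are hypothetical, stdlib-only stand-ins chosen for illustration; in practice one would use a library such as NLTK (e.g. `word_tokenize`, `PorterStemmer`, `WordNetLemmatizer`) or spaCy.

```python
import re

def tokenize(text):
    """Tokenization: lowercase the text and extract word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def stem(token):
    """Stemming (naive suffix stripping, illustrative only):
    'running' -> 'run', 'runs' -> 'run'. Cannot fix irregular forms."""
    for suffix in ("ning", "ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

# Lemmatization needs vocabulary knowledge; a toy lookup table here,
# where a real lemmatizer would consult WordNet or similar.
LEMMAS = {"ran": "run"}

def lemmatize(token):
    """Lemmatization: map a word to its dictionary base form."""
    return LEMMAS.get(token, token)

def preprocess(text):
    """Full pipeline: tokenize, then lemmatize, then stem."""
    return [stem(lemmatize(t)) for t in tokenize(text)]

print(preprocess("Running runs ran!"))  # -> ['run', 'run', 'run']
```

After this normalization, 'running', 'runs', and 'ran' all collapse to the single feature 'run', so the classifier no longer treats the three forms as unrelated words.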