Experiment - Lemmatization in spaCy
Problem: You want to convert the words in each sentence to their base forms (lemmas) using spaCy. Your current code extracts lemmas, but its output sometimes includes punctuation and stop words, which makes it noisy.
Current Metrics: Lemma extraction accuracy is 85% (checked manually on sample sentences). The output still includes unwanted tokens such as punctuation and stop words.
Issue: The code extracts lemmas correctly but does not filter out punctuation and stop words, which lowers the quality of the lemmatized output.
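One possible fix is to drop unwanted tokens before collecting lemmas. The sketch below is a minimal illustration, not the original code: the helper names `filtered_lemmas` and `lemmatize` are hypothetical, and it assumes the `en_core_web_sm` model is installed. It relies on spaCy's token attributes `is_punct`, `is_space`, `is_stop`, and `lemma_`.

```python
from typing import Iterable, List


def filtered_lemmas(tokens: Iterable) -> List[str]:
    """Return lemmas, skipping punctuation, whitespace, and stop words.

    Works on a spaCy Doc or any iterable of objects that expose
    lemma_, is_punct, is_space, and is_stop.
    """
    return [
        tok.lemma_
        for tok in tokens
        if not (tok.is_punct or tok.is_space or tok.is_stop)
    ]


def lemmatize(text: str, model: str = "en_core_web_sm") -> List[str]:
    """Run a spaCy pipeline on text and return the filtered lemmas."""
    # Imported here so filtered_lemmas can be used without spaCy installed.
    import spacy

    # Assumption: the model was downloaded beforehand, for example with
    # `python -m spacy download en_core_web_sm`.
    nlp = spacy.load(model)
    return filtered_lemmas(nlp(text))
```

Calling `lemmatize("The cats are running quickly.")` should return only the content-word lemmas. The exact result depends on the model version and its stop-word list, so it is worth re-checking accuracy on the same sample sentences after adding the filter.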