Challenge - 5 Problems
Stemming Mastery
❓ Predict Output
intermediate
Output of Porter Stemmer on a word list
What is the output list after applying the Porter Stemmer to the words ['running', 'jumps', 'easily', 'fairly']?
NLP
from nltk.stem import PorterStemmer

ps = PorterStemmer()
words = ['running', 'jumps', 'easily', 'fairly']
stemmed = [ps.stem(word) for word in words]
print(stemmed)
💡 Hint
The Porter stemmer strips suffixes such as 'ing' and 's', and rewrites a final 'y' as 'i' when the rest of the word contains a vowel.
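The Porter stemmer strips common suffixes to produce a stem, which need not be a dictionary word. 'running' becomes 'run' and 'jumps' becomes 'jump'. In 'easily' and 'fairly' the final 'y' is rewritten as 'i' and no later rule removes the 'li' ending, so they become 'easili' and 'fairli'. The printed list is ['run', 'jump', 'easili', 'fairli'].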
🧠 Conceptual
intermediate
Difference between Porter and Snowball Stemmer
Which statement correctly describes a key difference between the Porter Stemmer and the Snowball Stemmer?
💡 Hint
Think about language support and code design improvements.
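The Snowball stemmers were developed by Martin Porter as a successor to his original Porter stemmer. Snowball's English stemmer (often called Porter2) revises the original rules, and the same framework provides stemmers for many other languages, whereas the original Porter stemmer handles English only.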
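As a rough illustration of that difference, the sketch below runs NLTK's PorterStemmer and SnowballStemmer("english") on the same word and reads the SnowballStemmer.languages attribute. The sample word is an assumption chosen for illustration and is not part of the challenge.

```python
from nltk.stem import PorterStemmer, SnowballStemmer

porter = PorterStemmer()
snowball = SnowballStemmer("english")

word = "generously"  # illustrative sample word, not from the challenge
porter_stem = porter.stem(word)
snowball_stem = snowball.stem(word)
print(f"Porter:   {word} -> {porter_stem}")
print(f"Snowball: {word} -> {snowball_stem}")

# Language names that SnowballStemmer(...) accepts
supported = SnowballStemmer.languages
print(sorted(supported))
```

For this word the original Porter rules cut down to 'gener', while the Snowball English stemmer stops at 'generous'. The language list includes 'spanish' and 'french' alongside 'english'.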
❓ Metrics
advanced
Evaluating stemming impact on text classification accuracy
You train a text classifier on raw text and then on stemmed text using Porter Stemmer. The accuracy on test data changes from 82% to 79%. What is the most likely explanation?
💡 Hint
Consider how stemming affects word forms and model learning.
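Stemming collapses inflected and derived forms onto a shared stem. That shrinks the vocabulary and can reduce noise, but it also merges words whose differences carry signal (overstemming), so the classifier loses features it could have used. A small drop such as 82% to 79% is consistent with that loss.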
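One way to see the effect is to group a vocabulary by stem and compare sizes before and after. The sketch below does this with a toy word set, which is an assumption for illustration and not data from the challenge.

```python
from nltk.stem import PorterStemmer

ps = PorterStemmer()

# Toy vocabulary (hypothetical, chosen only to illustrate conflation)
raw_vocab = {"universe", "university", "universal", "run", "running", "runs"}

# Group the original words under the stem they are mapped to
stem_to_words = {}
for w in sorted(raw_vocab):
    stem_to_words.setdefault(ps.stem(w), []).append(w)

stemmed_vocab = set(stem_to_words)
print(len(raw_vocab), "raw terms ->", len(stemmed_vocab), "stems")
for stem, members in stem_to_words.items():
    print(stem, "<-", members)
```

Any group with more than one member is a set of words the classifier can no longer tell apart. When those words differ in meaning, the merged feature carries less information than the originals did.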
🔧 Debug
advanced
Identifying error in Snowball Stemmer usage
What error will this code raise?
from nltk.stem import SnowballStemmer
stemmer = SnowballStemmer('english')
print(stemmer.stem(123))
💡 Hint
Check the input type that the stem() method expects.
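SnowballStemmer.stem() expects a string. In NLTK's English stemmer the method begins by calling word.lower(), so passing the integer 123 fails with AttributeError ('int' object has no attribute 'lower') rather than TypeError.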
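A minimal sketch of guarding against this in a preprocessing step follows. The helper name safe_stem is hypothetical, and because the exact exception class may depend on the NLTK version, the demonstration catches both AttributeError and TypeError.

```python
from nltk.stem import SnowballStemmer

stemmer = SnowballStemmer("english")

def safe_stem(token):
    """Hypothetical helper: stem strings, return anything else unchanged."""
    if isinstance(token, str):
        return stemmer.stem(token)
    return token

# Reproduce the failure from the question and record which exception it was
try:
    stemmer.stem(123)
    error_name = None
except (AttributeError, TypeError) as exc:
    error_name = type(exc).__name__

print("stem(123) raised:", error_name)
print([safe_stem(t) for t in ["running", 123, "jumps"]])
```

Passing non-string tokens through unchanged is only one design choice. Converting them with str() or dropping them may fit a given pipeline better.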
❓ Model Choice
expert
Choosing stemming method for multilingual text preprocessing
You have a dataset with English, Spanish, and French texts. Which stemming approach is best to preprocess this data before training a model?
💡 Hint
Consider language support in stemming tools.
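The Snowball stemmer supports multiple languages, including English, Spanish, and French. Create one stemmer per language and route each document to the stemmer for its language. The Porter stemmer supports English only, so applying it to Spanish or French text would misapply English suffix rules.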
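A per-language setup could look like the sketch below. It assumes each document already carries a language label. The sample sentences, the whitespace tokenizer, and the helper name stem_document are illustrative assumptions.

```python
from nltk.stem import SnowballStemmer

# One stemmer per language, built once and reused
stemmers = {lang: SnowballStemmer(lang) for lang in ("english", "spanish", "french")}

def stem_document(language, text):
    """Hypothetical helper: stem whitespace-separated tokens with the stemmer for `language`."""
    stemmer = stemmers[language]
    return [stemmer.stem(token) for token in text.split()]

# Toy labelled documents (assumed; a real dataset may need language detection first)
docs = [
    ("english", "running jumps"),
    ("spanish", "corriendo saltos"),
    ("french", "courant sauts"),
]

results = [(lang, stem_document(lang, text)) for lang, text in docs]
for lang, stems in results:
    print(lang, stems)
```

If the language of a document is unknown, a language identification step would have to come first. That step is outside the scope of this sketch.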