Challenge - 5 Problems

❓ Predict Output

intermediate

Output of Porter Stemmer on a word list

What is the output list after applying the Porter Stemmer to the words ['running', 'jumps', 'easily', 'fairly']?

NLP

from nltk.stem import PorterStemmer
ps = PorterStemmer()
words = ['running', 'jumps', 'easily', 'fairly']
stemmed = [ps.stem(word) for word in words]
print(stemmed)

A. ['run', 'jump', 'easily', 'fairly']

B. ['running', 'jump', 'easily', 'fairly']

C. ['run', 'jumps', 'easili', 'fairly']

D. ['run', 'jump', 'easili', 'fairli']

🧠 Conceptual

intermediate

Difference between Porter and Snowball Stemmer

Which statement correctly describes a key difference between the Porter Stemmer and the Snowball Stemmer?

A. Snowball Stemmer always produces longer stems than Porter Stemmer.

B. Porter Stemmer supports multiple languages while Snowball only supports English.

C. Snowball Stemmer is a newer, more readable and flexible version of Porter Stemmer supporting multiple languages.

D. Porter Stemmer uses machine learning while Snowball uses rule-based stemming.

❓ Metrics

advanced

Evaluating stemming impact on text classification accuracy

You train a text classifier first on raw text and then on text stemmed with the Porter Stemmer. Test accuracy drops from 82% to 79%. What is the most likely explanation?

A. Stemming always improves accuracy, so this must be a bug in the code.

B. Stemming reduced vocabulary size but also removed useful distinctions, slightly lowering accuracy.

C. The test data was stemmed but training data was not, causing mismatch and accuracy drop.

D. Porter Stemmer introduced spelling errors that confused the classifier.
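
To make the trade-off in this question concrete, here is a minimal sketch of how stemming can shrink a vocabulary while also merging words that mean different things. It uses a toy suffix stripper (`toy_stem`, a hypothetical helper written for this example), not the Porter algorithm, so the stems it produces should not be read as Porter output.

```python
# Toy suffix stripper for illustration only; it is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "ly", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


tokens = ["connect", "connected", "connecting", "connects", "new", "news"]

raw_vocab = set(tokens)
stemmed_vocab = {toy_stem(t) for t in tokens}

# Group the original tokens by the stem they collapse to.
merged = {}
for t in tokens:
    merged.setdefault(toy_stem(t), []).append(t)

print(len(raw_vocab), len(stemmed_vocab))
print(merged)
```

The four inflections of "connect" collapsing into one feature is usually harmless, but "new" and "news" sharing a stem is the kind of lost distinction that can cost a classifier a few points.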

🔧 Debug

advanced

Identifying error in Snowball Stemmer usage

What error will this code raise?

from nltk.stem import SnowballStemmer
stemmer = SnowballStemmer('english')
print(stemmer.stem(123))

A. TypeError, because stem() expects a string, not an integer

B. AttributeError, because integers have no lower() method

C. ValueError, because 'english' is not a valid language

D. No error, outputs '123'
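
Whichever exception the snippet raises, a small guard at the call site keeps non-string tokens from reaching the stemmer at all. The sketch below assumes nothing about NLTK internals: `safe_stem` and `stand_in_stem` are hypothetical names introduced here, and the stand-in only lowercases and strips a trailing "s" so the example runs without NLTK installed.

```python
from typing import Callable


def safe_stem(stem: Callable[[str], str], token: object) -> str:
    """Call `stem` only on strings; fail early with a clear message otherwise."""
    if not isinstance(token, str):
        raise TypeError(
            f"expected a str token, got {type(token).__name__}: {token!r}"
        )
    return stem(token)


def stand_in_stem(word: str) -> str:
    """Hypothetical stand-in for a real stemmer: lowercase, drop a final 's'."""
    word = word.lower()
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


print(safe_stem(stand_in_stem, "Jumps"))

try:
    safe_stem(stand_in_stem, 123)
except TypeError as exc:
    print("rejected:", exc)
```

With NLTK available, the same wrapper could be given `SnowballStemmer('english').stem` in place of `stand_in_stem`.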

❓ Model Choice

expert

Choosing stemming method for multilingual text preprocessing

You have a dataset containing English, Spanish, and French texts. Which stemming approach is best suited to preprocessing this data before training a model?

A. Use Snowball Stemmer specifying the language for each text before stemming

B. Use Porter Stemmer on all texts regardless of language

C. Use a custom rule-based stemmer designed only for English

D. Use no stemming and rely on raw text for all languages
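
For readers who want to see what language-aware preprocessing might look like in practice, the following sketch routes each document to a stemmer chosen by its language tag. It assumes every document already carries a language label. `stem_by_language` and `toy_factory` are hypothetical names, and the per-language suffix rules are deliberately crude placeholders rather than real Snowball rules.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Stemmer = Callable[[str], str]

# Placeholder plural suffixes per language; real stemmers use far richer rules.
TOY_SUFFIXES: Dict[str, Tuple[str, ...]] = {
    "english": ("s",),
    "spanish": ("es", "s"),
    "french": ("s",),
}


def toy_factory(language: str) -> Stemmer:
    """Return a toy stemmer for `language`; raises KeyError if unsupported."""
    suffixes = TOY_SUFFIXES[language]

    def stem(word: str) -> str:
        for suffix in suffixes:
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                return word[: -len(suffix)]
        return word

    return stem


def stem_by_language(
    docs: Iterable[Tuple[str, List[str]]],
    make_stemmer: Callable[[str], Stemmer],
) -> List[List[str]]:
    """Stem each (language, tokens) pair with a stemmer built for that language."""
    cache: Dict[str, Stemmer] = {}
    result = []
    for language, tokens in docs:
        if language not in cache:
            cache[language] = make_stemmer(language)  # build once per language
        result.append([cache[language](t) for t in tokens])
    return result


docs = [
    ("english", ["cats", "dog"]),
    ("spanish", ["flores"]),
    ("french", ["chats"]),
]
stemmed_docs = stem_by_language(docs, toy_factory)
print(stemmed_docs)
```

With NLTK installed, `make_stemmer` could instead be `lambda lang: SnowballStemmer(lang).stem`, which keeps the dispatch logic unchanged while swapping in real per-language rules.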

Practice

(1/5)

1. What is the main purpose of stemming in Natural Language Processing?

easy

A. To reduce words to their base or root form

B. To translate text into another language

C. To count the number of words in a sentence

D. To generate synonyms for words

Stemming (Porter, Snowball) in NLP - Practice Problems & Coding Challenges

Solution

Step 1: Understand stemming concept

Step 2: Compare options with stemming goal

Final Answer:

Quick Check:

Solution

Step 1: Recall correct import syntax in Python

Step 2: Match with NLTK Porter Stemmer import

Final Answer:

Quick Check:

Solution

Step 1: Apply Porter Stemmer to each word

Step 2: List the stemmed results

Final Answer:

Quick Check:

Solution

Step 1: Check SnowballStemmer import and usage

Step 2: Verify method call and output

Final Answer:

Quick Check:

Solution

Step 1: Understand the condition for stemming

Step 2: Check list comprehension syntax

Final Answer:

Quick Check: