
How to Use NLTK Stemmer in NLP for Text Processing

To use an NLTK stemmer, import a stemmer class such as PorterStemmer from nltk.stem, create an instance, and call its stem() method on individual words. Each call returns the word's stem, a shortened base form that is not always a dictionary word, so related word forms can be treated as one term during text analysis.

📐

Syntax

The basic syntax to use an NLTK stemmer involves importing the stemmer class, creating an instance, and applying the stem() method to words.

from nltk.stem import PorterStemmer: imports the Porter stemmer class.
stemmer = PorterStemmer(): creates a stemmer object.
stemmer.stem(word): returns the stemmed form of the input word.

python

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
word = 'running'
stemmed_word = stemmer.stem(word)
print(stemmed_word)

Output

run

💻

Example

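This example shows how to stem a list of words using NLTK's PorterStemmer. Some related forms collapse to the same stem (running and runs both become run), while others do not: runner is left unchanged, and easily and fairly become the non-words easili and fairli.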

python

from nltk.stem import PorterStemmer

words = ['running', 'runs', 'runner', 'easily', 'fairly']
stemmer = PorterStemmer()
stemmed_words = [stemmer.stem(word) for word in words]
print(stemmed_words)

Output

['run', 'run', 'runner', 'easili', 'fairli']
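
To make it easier to see which word forms actually share a stem, the stemmed words can be grouped by their result. The following is a minimal sketch rather than a standard NLTK recipe; the `groups` dictionary is an illustrative name introduced here, not part of the NLTK API.

```python
from collections import defaultdict

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
words = ['running', 'runs', 'runner', 'easily', 'fairly']

# Map each stem to the original words that produced it.
# `groups` is a name chosen for this sketch only.
groups = defaultdict(list)
for word in words:
    groups[stemmer.stem(word)].append(word)

for stem, originals in groups.items():
    print(stem, '<-', originals)
```

Only running and runs end up in the same group here. Every other word keeps a group of its own, which is a quick way to check whether a stemmer merges the forms you expect before relying on it.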

⚠️

Common Pitfalls

Common mistakes include:

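Using the stemmer before NLTK is installed (pip install nltk) or before the stemmer class is imported.
Confusing stemming with lemmatization: a stemmer strips suffixes by rule, without consulting a dictionary, so the result can be a non-word such as easili.
Passing a whole sentence to stem() instead of splitting it into words first.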

Always tokenize text into words before stemming.

python

from nltk.stem import PorterStemmer

# Wrong: stemming a sentence string directly
stemmer = PorterStemmer()
sentence = 'He is running fast'
# The whole sentence is treated as one word, so 'running' is not stemmed;
# the string is only lowercased
print(stemmer.stem(sentence))

# Right: tokenize first, then stem each word
words = sentence.split()
stemmed = [stemmer.stem(word) for word in words]
print(stemmed)

Output

he is running fast
['he', 'is', 'run', 'fast']
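
Note that sentence.split() only splits on whitespace, so punctuation stays attached to words (for example 'fast,'), and such tokens may not stem as expected. One possible workaround, sketched below, is to extract word tokens with a regular expression before stemming. The helper name `stem_text` and the token pattern are assumptions made for this illustration, not part of NLTK, and the pattern only covers ASCII letters and apostrophes.

```python
import re

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()


def stem_text(text):
    """Tokenize text with a simple regex, then stem each token.

    Hypothetical helper for illustration; not an NLTK function.
    """
    # Keep runs of ASCII letters and apostrophes; punctuation and digits are dropped.
    tokens = re.findall(r"[A-Za-z']+", text)
    return [stemmer.stem(token) for token in tokens]


print(stem_text('He is running fast, and she runs.'))
```

For real projects, a dedicated tokenizer is usually a better fit than this pattern, but the order of operations stays the same: tokenize first, then stem each token.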

📊

Quick Reference

Step	Description	Code Example
Import Stemmer	Import the stemmer class from nltk.stem	from nltk.stem import PorterStemmer
Create Stemmer	Make an instance of the stemmer	stemmer = PorterStemmer()
Stem Word	Apply stem() method to a word	stemmer.stem('running') # returns 'run'
Stem List	Stem multiple words using list comprehension	[stemmer.stem(w) for w in words]

✅

Key Takeaways

Import and create an NLTK stemmer object before stemming words.

Use the stem() method on individual words, not full sentences.

Stemming reduces words to root forms but may produce non-words.

Always tokenize text into words before applying stemming.

PorterStemmer is a common choice for English stemming in NLTK.