
POS Tagging in Python for NLP: Simple Guide with Examples

You can do POS tagging in Python with the nltk library by first tokenizing text with word_tokenize and then passing the tokens to pos_tag. This labels each word with its part of speech, such as noun, verb, or adjective.

Syntax

POS tagging in Python with NLTK involves two main steps:

  • word_tokenize(text): splits the text into words (tokens).
  • pos_tag(tokens): assigns a POS tag to each token.

The output is a list of tuples where each tuple contains a word and its POS tag.

python
from nltk import word_tokenize, pos_tag

text = "I love learning NLP."
tokens = word_tokenize(text)
pos_tags = pos_tag(tokens)
print(pos_tags)
Output
[('I', 'PRP'), ('love', 'VBP'), ('learning', 'VBG'), ('NLP', 'NNP'), ('.', '.')]

Example

This example shows how to tokenize a sentence and get POS tags for each word using NLTK.

python
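import nltk
# One-time download of the tokenizer and tagger data.
# Recent NLTK releases may instead ask for 'punkt_tab' and
# 'averaged_perceptron_tagger_eng'; the LookupError message names the resource.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')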

from nltk import word_tokenize, pos_tag

sentence = "Python is great for natural language processing."
tokens = word_tokenize(sentence)
pos_tags = pos_tag(tokens)
print(pos_tags)
Output
[('Python', 'NNP'), ('is', 'VBZ'), ('great', 'JJ'), ('for', 'IN'), ('natural', 'JJ'), ('language', 'NN'), ('processing', 'NN'), ('.', '.')]
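
Because the result is an ordinary list of (word, tag) tuples, it can be processed with plain Python. The sketch below reuses the output printed above as a literal, so it runs without NLTK. The helper name words_with_tag_prefix is hypothetical and not part of NLTK.

```python
# Tagged output copied from the example above (no NLTK needed to run this).
tagged = [
    ('Python', 'NNP'), ('is', 'VBZ'), ('great', 'JJ'), ('for', 'IN'),
    ('natural', 'JJ'), ('language', 'NN'), ('processing', 'NN'), ('.', '.'),
]

def words_with_tag_prefix(tagged_tokens, prefix):
    """Return the words whose POS tag starts with the given prefix."""
    return [word for word, tag in tagged_tokens if tag.startswith(prefix)]

# Matching on a prefix groups related tags, e.g. 'NN' covers NN, NNS, NNP.
nouns = words_with_tag_prefix(tagged, 'NN')
adjectives = words_with_tag_prefix(tagged, 'JJ')

print(nouns)
print(adjectives)
```

For this sentence, nouns is ['Python', 'language', 'processing'] and adjectives is ['great', 'natural'].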

Common Pitfalls

Common mistakes when doing POS tagging include:

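  • Passing a raw string to pos_tag instead of a token list. Recent NLTK versions raise a TypeError; older ones silently tag every character as if it were a word.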
  • Forgetting to download required NLTK data packages like punkt and averaged_perceptron_tagger.
  • Assuming POS tags are full words instead of short codes (e.g., NN means noun).
python
import nltk

# Wrong: tagging raw text without tokenizing.
# Recent NLTK versions raise a TypeError here (wording may vary); older ones tag each character.
try:
    print(nltk.pos_tag("This is wrong"))
except Exception as e:
    print(f"Error: {e}")

# Right: tokenize first
from nltk import word_tokenize, pos_tag
text = "This is correct"
tokens = word_tokenize(text)
print(pos_tag(tokens))
Output
Error: tokens: expected a list of strings, got a string
[('This', 'DT'), ('is', 'VBZ'), ('correct', 'JJ')]
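
The untokenized-input pitfall comes from the fact that a Python string is itself iterable, one character at a time, so code expecting a list of tokens can mistake a raw string for a list of one-letter tokens. Below is a minimal sketch of a guard that catches this before tagging. ensure_token_list is a hypothetical helper, not an NLTK function.

```python
def ensure_token_list(tokens):
    """Return tokens as a list, rejecting a raw string passed by mistake."""
    if isinstance(tokens, str):
        raise TypeError(
            "expected a list of tokens, got a string; tokenize it first"
        )
    return list(tokens)

# Iterating over a raw string yields characters, not words.
print(list("NLP"))

# A proper token list passes through unchanged.
print(ensure_token_list(["This", "is", "correct"]))

# A raw string is rejected with a clear message.
try:
    ensure_token_list("This is wrong")
except TypeError as exc:
    print(f"Error: {exc}")
```

Wrapping the argument, as in pos_tag(ensure_token_list(tokens)), would surface the mistake with the same message whichever NLTK version is installed.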

Quick Reference

Common tags from the Penn Treebank tagset, which NLTK's default English tagger uses:

POS Tag   Meaning
NN        Noun, singular or mass
NNS       Noun, plural
VB        Verb, base form
VBD       Verb, past tense
JJ        Adjective
RB        Adverb
PRP       Personal pronoun
IN        Preposition or subordinating conjunction
.         Sentence-final punctuation
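
The table can be turned into a small lookup so that tagged output reads more naturally. This is a sketch covering only the tags listed above. TAG_MEANINGS and describe are illustrative names, not part of NLTK, and tags outside the table fall back to a placeholder string.

```python
# Meanings for the tags in the table above (a small subset of the full tagset).
TAG_MEANINGS = {
    'NN': 'noun, singular or mass',
    'NNS': 'noun, plural',
    'VB': 'verb, base form',
    'VBD': 'verb, past tense',
    'JJ': 'adjective',
    'RB': 'adverb',
    'PRP': 'personal pronoun',
    'IN': 'preposition or subordinating conjunction',
    '.': 'sentence-final punctuation',
}

def describe(tagged_tokens):
    """Pair each word with a readable description of its tag."""
    return [(word, TAG_MEANINGS.get(tag, 'unknown tag: ' + tag))
            for word, tag in tagged_tokens]

# Tagged output copied from the Example section.
tagged = [
    ('Python', 'NNP'), ('is', 'VBZ'), ('great', 'JJ'), ('for', 'IN'),
    ('natural', 'JJ'), ('language', 'NN'), ('processing', 'NN'), ('.', '.'),
]
for word, meaning in describe(tagged):
    print(word, '->', meaning)
```

Tags such as NNP and VBZ are not in the table, so they print as unknown until the lookup is extended.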

Key Takeaways

  • Always tokenize text before POS tagging using word_tokenize.
  • Use nltk.pos_tag to get part-of-speech tags for each token.
  • Download required NLTK data packages before running POS tagging.
  • POS tags are short codes representing word types, not full words.
  • Common errors come from skipping tokenization or missing downloads.