How to Do POS Tagging in Python for NLP


POS Tagging in Python for NLP: Simple Guide with Examples

You can do POS tagging in Python with the NLTK library: tokenize the text with word_tokenize, then pass the resulting tokens to pos_tag. Each token is labeled with its part of speech, such as noun, verb, or adjective.

📐

Syntax

POS tagging in Python with NLTK involves two main steps:

word_tokenize(text): splits the text into words (tokens).
pos_tag(tokens): assigns a POS tag to each token.

The output is a list of tuples where each tuple contains a word and its POS tag.

python

from nltk import word_tokenize, pos_tag

text = "I love learning NLP."
tokens = word_tokenize(text)
pos_tags = pos_tag(tokens)
print(pos_tags)

Output

[('I', 'PRP'), ('love', 'VBP'), ('learning', 'VBG'), ('NLP', 'NNP'), ('.', '.')]
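Because the result is an ordinary list of tuples, it can be handled with plain Python. The following is a minimal sketch that reuses the output above as literal data, so it runs without calling NLTK; the variable names words, tags, and tag_of are only illustrative.

```python
# Tagged output copied from the example above; no NLTK call is needed here.
pos_tags = [('I', 'PRP'), ('love', 'VBP'), ('learning', 'VBG'), ('NLP', 'NNP'), ('.', '.')]

# Each element is a (word, tag) tuple, so it unpacks directly in a loop.
for word, tag in pos_tags:
    print(f"{word:<10}{tag}")

# zip(*...) splits the pairs into two parallel tuples.
words, tags = zip(*pos_tags)

# A dict gives word -> tag lookup (a repeated word keeps only its last tag).
tag_of = dict(pos_tags)
print(tag_of['learning'])  # VBG
```

The dict form is convenient for short sentences, but it drops information when the same word appears more than once with different tags, so the list of tuples is usually the safer structure to keep.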

💻

Example

This example shows how to tokenize a sentence and get POS tags for each word using NLTK.

python

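import nltk

# One-time downloads: tokenizer models and the English POS tagger.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
# NLTK 3.9 and later look for these renamed packages instead.
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')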

from nltk import word_tokenize, pos_tag

sentence = "Python is great for natural language processing."
tokens = word_tokenize(sentence)
pos_tags = pos_tag(tokens)
print(pos_tags)

Output

[('Python', 'NNP'), ('is', 'VBZ'), ('great', 'JJ'), ('for', 'IN'), ('natural', 'JJ'), ('language', 'NN'), ('processing', 'NN'), ('.', '.')]
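A common next step is to pull out only the words of one category. Related tags share a prefix (NN, NNS, and NNP are all nouns), so a prefix match is usually enough. This is a small sketch using the output above as literal data; words_with_tag_prefix is a hypothetical helper, not an NLTK function.

```python
def words_with_tag_prefix(tagged, prefix):
    """Return the words whose tag starts with `prefix` (e.g. 'NN' matches NN, NNS, NNP)."""
    return [word for word, tag in tagged if tag.startswith(prefix)]

# Tagged output copied from the example above.
tagged = [('Python', 'NNP'), ('is', 'VBZ'), ('great', 'JJ'), ('for', 'IN'),
          ('natural', 'JJ'), ('language', 'NN'), ('processing', 'NN'), ('.', '.')]

nouns = words_with_tag_prefix(tagged, 'NN')
adjectives = words_with_tag_prefix(tagged, 'JJ')
verbs = words_with_tag_prefix(tagged, 'VB')

print(nouns)       # ['Python', 'language', 'processing']
print(adjectives)  # ['great', 'natural']
print(verbs)       # ['is']
```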

⚠️

Common Pitfalls

Common mistakes when doing POS tagging include:

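Not tokenizing text before tagging. pos_tag expects a list of tokens, so passing a raw string raises a TypeError in recent NLTK releases (older releases tag each character separately).
Forgetting to download the required NLTK data packages such as punkt and averaged_perceptron_tagger (named punkt_tab and averaged_perceptron_tagger_eng in NLTK 3.9 and later), which causes a LookupError.
Expecting full words as tags. The tagger returns short codes (e.g., NN means singular noun).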

python

from nltk import word_tokenize, pos_tag

# Wrong: passing a raw string instead of a list of tokens.
# Recent NLTK releases raise TypeError here.
try:
    print(pos_tag("This is wrong"))
except TypeError as e:
    print(f"Error: {e}")

# Right: tokenize first, then tag the tokens
text = "This is correct"
tokens = word_tokenize(text)
print(pos_tag(tokens))

Output

Error: tokens: expected a list of strings, got a string
[('This', 'DT'), ('is', 'VBZ'), ('correct', 'JJ')]
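One way to avoid the first pitfall is a small wrapper that accepts either a raw string or a list of tokens and tokenizes only when needed. The sketch below is an assumption about how such a helper could look: tag_text and its tokenize and tag parameters are hypothetical names, and by default it falls back to NLTK's word_tokenize and pos_tag.

```python
def tag_text(text_or_tokens, tokenize=None, tag=None):
    """Tag a raw string or a token list (hypothetical helper, not part of NLTK)."""
    if tokenize is None or tag is None:
        # Imported lazily so NLTK is only required when the defaults are used.
        from nltk import word_tokenize, pos_tag
        tokenize = tokenize or word_tokenize
        tag = tag or pos_tag

    if isinstance(text_or_tokens, str):
        # A raw string must be split into tokens before tagging.
        tokens = tokenize(text_or_tokens)
    else:
        # Anything else is assumed to be an iterable of tokens already.
        tokens = list(text_or_tokens)
    return tag(tokens)
```

With NLTK installed and its data downloaded, tag_text("This is correct") and tag_text(["This", "is", "correct"]) should return the same tags. Passing tokenize and tag explicitly makes it possible to swap in another tokenizer or tagger, or simple stand-ins for testing.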

📊

Quick Reference

Common tags from the Penn Treebank tagset, which NLTK's default English tagger uses:

POS Tag	Meaning
NN	Noun, singular
NNS	Noun, plural
VB	Verb, base form
VBD	Verb, past tense
JJ	Adjective
RB	Adverb
PRP	Personal pronoun
IN	Preposition or subordinating conjunction
.	Sentence-ending punctuation (. ? !)
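The table can also be kept in code as a lookup so that tagged output is easier to read. This is a minimal sketch based on the quick-reference entries above; TAG_MEANINGS and describe are hypothetical names, and the dict covers only this small subset of the tagset, so unknown tags are returned unchanged.

```python
# Descriptions based on the quick-reference table (a small subset of the tagset).
TAG_MEANINGS = {
    'NN': 'Noun, singular',
    'NNS': 'Noun, plural',
    'VB': 'Verb, base form',
    'VBD': 'Verb, past tense',
    'JJ': 'Adjective',
    'RB': 'Adverb',
    'PRP': 'Personal pronoun',
    'IN': 'Preposition or subordinating conjunction',
    '.': 'Punctuation',
}

def describe(tagged):
    """Pair each word with a readable description; unknown tags pass through as-is."""
    return [(word, TAG_MEANINGS.get(tag, tag)) for word, tag in tagged]

# Tagged output copied from the earlier example.
tagged = [('Python', 'NNP'), ('is', 'VBZ'), ('great', 'JJ'), ('for', 'IN'),
          ('natural', 'JJ'), ('language', 'NN'), ('processing', 'NN'), ('.', '.')]

for word, meaning in describe(tagged):
    print(f"{word:<12}{meaning}")
```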

✅

Key Takeaways

Always tokenize text before POS tagging using word_tokenize.

Use nltk.pos_tag to get part-of-speech tags for each token.

Download required NLTK data packages before running POS tagging.

POS tags are short codes representing word types, not full words.

Common errors come from skipping tokenization or missing downloads.