How to Use NLTK Concordance in NLP: Simple Guide

Use the concordance() method from NLTK's Text class to find all occurrences of a word and see its surrounding context in a text. First, tokenize your text, create an NLTK Text object, then call concordance('word') to display matches with context.

📐

Syntax

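The concordance() method is called on an NLTK Text object. It takes the word to search for, plus two optional arguments: width (the maximum line width in characters, 79 by default) and lines (the maximum number of matches to print, 25 by default). It prints each match with its surrounding words.

Text.concordance(word, width=79, lines=25): Finds occurrences of word in the text and prints them in context.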

python

from nltk.text import Text

text = Text(['this', 'is', 'a', 'sample', 'text', 'with', 'sample', 'words'])
text.concordance('sample')

Output

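Displaying 2 of 2 matches:
this is a sample text with sample words
this is a sample text with sample words

Each match is printed on its own line. Because this text is so short, both lines show the whole token sequence. Leading padding may differ between NLTK versions.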
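The optional width and lines arguments are easiest to understand by comparing what they print. The following is a minimal sketch rather than a definitive recipe: the token list is made up for illustration, and captured_concordance is a hypothetical helper (not part of NLTK) that collects the printed lines so they can be inspected.

```python
import contextlib
import io

from nltk.text import Text

# A small pre-tokenized text, so no tokenizer data is needed.
tokens = "the cat sat on the mat while the dog slept by the door".split()
text = Text(tokens)


def captured_concordance(word, **kwargs):
    """Hypothetical helper: return the lines that concordance() prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        text.concordance(word, **kwargs)
    return buffer.getvalue().splitlines()


default_lines = captured_concordance("the")           # header + 4 matches
limited_lines = captured_concordance("the", lines=2)  # header + first 2 matches
narrow_lines = captured_concordance("the", width=30)  # shorter context per side

print(default_lines[0])  # Displaying 4 of 4 matches:
print(limited_lines[0])  # Displaying 2 of 4 matches:
```

Lowering lines only limits how many matches are printed; the header still reports the total number found.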

💻

Example

This example shows how to tokenize a sample sentence, create an NLTK Text object, and use concordance() to find the word 'sample' with its context.

python

import nltk
from nltk.text import Text

# Sample sentence
sentence = 'This is a sample sentence to demonstrate NLTK concordance functionality.'

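# Tokenize the sentence. word_tokenize needs NLTK's Punkt tokenizer data,
# fetched once with nltk.download('punkt') (or 'punkt_tab' on newer releases).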
tokens = nltk.word_tokenize(sentence)

# Create Text object
text_obj = Text(tokens)

# Use concordance to find 'sample'
text_obj.concordance('sample')

Output

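Displaying 1 of 1 matches:
This is a sample sentence to demonstrate NLTK concor

The context on each side is cut to fit the default 79-character line width, so the right-hand side ends mid-word here. Leading padding may differ between NLTK versions.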
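concordance() only prints its results. To work with the matches in code, recent NLTK releases also provide Text.concordance_list(), which returns the matches as ConcordanceLine tuples. The sketch below assumes such a release is installed and uses a pre-split token list instead of word_tokenize so that it runs without extra data downloads.

```python
from nltk.text import Text

# Pre-tokenized version of the example sentence (punctuation split by hand).
tokens = "This is a sample sentence to demonstrate NLTK concordance functionality .".split()
text_obj = Text(tokens)

# Each ConcordanceLine carries the left context, the matched token(s),
# the right context, and the token offset of the match.
matches = text_obj.concordance_list("sample")
for match in matches:
    print(match.offset, match.left, match.query, match.right[:3])
```

Here match.left and match.right are lists of tokens, which makes them convenient for further filtering or counting.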

⚠️

Common Pitfalls

1. Passing a raw string to Text() does not raise an error, but the string is split into single characters, so no word can ever match. Tokenize the text first.
2. Calling concordance() directly on a string or a list of tokens raises an AttributeError, because the method exists only on Text objects.
3. Assuming the search is case-sensitive. concordance() lowercases both the query and the indexed tokens, so searching for 'Sample' also returns 'sample'.

Because matching already ignores case, there is no need to lowercase the tokens before creating the Text object.

python

import nltk
from nltk.text import Text

sentence = 'Sample text with sample words.'
tokens = nltk.word_tokenize(sentence)

# Right: build the Text object from tokens
text_obj = Text(tokens)
text_obj.concordance('Sample')  # Finds both 'Sample' and 'sample'
text_obj.concordance('sample')  # Finds the same two matches

# Wrong: a raw string is treated as a sequence of characters
text_from_string = Text(sentence)
text_from_string.concordance('sample')  # Reports no matches

Output

The first two calls each print Displaying 2 of 2 matches: followed by one context line per match. The last call reports no matches.
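concordance() builds its index on lowercased tokens, so a case-sensitive search has to be done as a filtering step afterwards. One possible approach, sketched here under the assumption that the installed NLTK release provides Text.concordance_list(), is to compare each match's original token against the exact spelling wanted.

```python
from nltk.text import Text

# Tokens written out by hand so the example runs without tokenizer data.
tokens = ["Sample", "text", "with", "sample", "words", "."]
text_obj = Text(tokens)

# The lookup ignores case, so both spellings are returned.
all_matches = text_obj.concordance_list("sample")

# match.query holds the token as it appears in the text, so an exact
# comparison keeps only the lowercase occurrence.
exact_matches = [m for m in all_matches if m.query == "sample"]

print(len(all_matches), len(exact_matches))  # 2 1
```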

📊

Quick Reference

Method	Description
Text.concordance(word)	Print occurrences of 'word' with context (up to 25 by default)
Text.similar(word)	Show words used in similar contexts
Text.common_contexts([words])	Show common contexts for given words
Text.tokens	Access the list of tokens in the text
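As a rough illustration of the other entries in the table, the sketch below reads Text.tokens and calls common_contexts() on a small invented token list. Like concordance(), common_contexts() prints its result instead of returning it, so the example captures standard output to make the result available as a string.

```python
import contextlib
import io

from nltk.text import Text

# Invented example text in which 'cat' and 'dog' share a context.
tokens = "the cat sat on the mat . the dog sat on the rug .".split()
text = Text(tokens)

print(text.tokens[:4])  # the underlying token list

# Capture what common_contexts() prints. Each context is written as
# left_right, the words immediately before and after the target.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    text.common_contexts(["cat", "dog"])
shared_contexts = buffer.getvalue().strip()
print(shared_contexts)
```

On a text this small the results are only illustrative; similar() and common_contexts() become informative on larger corpora where words recur in many contexts.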

✅

Key Takeaways

Use NLTK's Text.concordance(word) to find all contexts of a word in tokenized text.

Always tokenize your text before creating the Text object for concordance to work.

Concordance search ignores case, so 'Sample' and 'sample' return the same matches without any lowercasing on your part.

Concordance helps explore how words are used in sentences by showing surrounding words.

NLTK's Text class also offers other useful methods, such as similar() and common_contexts(), for context analysis.