
Visualizing topics (pyLDAvis) in NLP

Introduction

Topic visualization shows the main themes a topic model found in a collection of documents. pyLDAvis draws those topics as an interactive chart, which makes them easier to understand and explain.

You want to explore what themes appear in customer reviews.
You need to explain topics found in news articles to your team.
You want to check if your topic model grouped words well.
You want to compare topics from different sets of documents.
Syntax
import pyLDAvis
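# pyLDAvis 3.4.0 renamed this module to pyLDAvis.lda_model; on newer versions,
# import pyLDAvis.lda_model and call pyLDAvis.lda_model.prepare(...) instead.
import pyLDAvis.sklearn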

# Prepare visualization data
vis_data = pyLDAvis.sklearn.prepare(lda_model, dtm, vectorizer)

# Show visualization in notebook or save as html
pyLDAvis.display(vis_data)
# or
pyLDAvis.save_html(vis_data, 'lda_vis.html')

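lda_model is your trained scikit-learn topic model (for example, LatentDirichletAllocation).

dtm is the document-term matrix used for training.

vectorizer is the fitted vectorizer (for example, CountVectorizer) that produced dtm; pyLDAvis reads the vocabulary from it.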
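
The module that holds the scikit-learn helper depends on the installed pyLDAvis version, so one import line may not work everywhere. A minimal sketch of a version-tolerant import follows; the function name load_sklearn_adapter is made up for this example, and it assumes the helper is called pyLDAvis.lda_model in newer releases and pyLDAvis.sklearn in older ones.

```python
import importlib


def load_sklearn_adapter():
    """Return pyLDAvis's scikit-learn helper module, or None if unavailable.

    Hypothetical helper for illustration; not part of pyLDAvis itself.
    """
    # Assumption: newer releases expose pyLDAvis.lda_model,
    # older releases expose pyLDAvis.sklearn.
    for name in ("pyLDAvis.lda_model", "pyLDAvis.sklearn"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


adapter = load_sklearn_adapter()
if adapter is None:
    print("pyLDAvis scikit-learn helper not found; is pyLDAvis installed?")
else:
    print("Using", adapter.__name__)
    # With a trained model at hand, the call has the same shape as above:
    # vis_data = adapter.prepare(lda_model, dtm, vectorizer)
```

Whichever module is found, prepare takes the same three arguments described above.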

Examples
Visualize topics directly in a Jupyter notebook.
import pyLDAvis
# Named pyLDAvis.lda_model in pyLDAvis 3.4.0 and later
import pyLDAvis.sklearn
vis_data = pyLDAvis.sklearn.prepare(lda, dtm, vectorizer)
pyLDAvis.display(vis_data)
Save the interactive visualization as an HTML file to open in a browser later.
pyLDAvis.save_html(vis_data, 'topics.html')
Sample Model

This code trains a simple topic model on a few sentences, then creates and saves an interactive visualization showing the topics and important words.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
import pyLDAvis
# Named pyLDAvis.lda_model in pyLDAvis 3.4.0 and later
import pyLDAvis.sklearn

# Sample documents
texts = [
    'I love reading books about science and technology.',
    'The new movie was fantastic and thrilling.',
    'Technology advances help science grow.',
    'Movies and books are great entertainment.',
    'Science and technology are closely related fields.'
]

# Convert texts to document-term matrix
vectorizer = CountVectorizer(stop_words='english')
dtm = vectorizer.fit_transform(texts)

# Train LDA model with 2 topics
lda = LatentDirichletAllocation(n_components=2, random_state=42)
lda.fit(dtm)

# Prepare visualization
vis_data = pyLDAvis.sklearn.prepare(lda, dtm, vectorizer)

# Save visualization to HTML
pyLDAvis.save_html(vis_data, 'lda_visualization.html')

print('LDA visualization saved as lda_visualization.html')
Important Notes

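pyLDAvis works best with models trained on raw word counts, such as the matrix produced by CountVectorizer, because it uses those counts to work out how often each word appears.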
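
To see why counts matter, here is a minimal sketch of the two quantities that can be read off a count matrix: how long each document is and how often each word occurs overall. The matrix and vocabulary are invented for illustration, and the variable names doc_lengths and term_frequency are only assumed to mirror what pyLDAvis works with internally.

```python
import numpy as np

# Hypothetical count matrix: 3 documents (rows) x 4 terms (columns)
vocab = ["science", "technology", "movie", "books"]
dtm = np.array([
    [2, 1, 0, 1],
    [0, 0, 3, 1],
    [1, 2, 0, 0],
])

doc_lengths = dtm.sum(axis=1)     # number of counted words in each document
term_frequency = dtm.sum(axis=0)  # total occurrences of each term

print("Document lengths:", doc_lengths.tolist())
print("Term frequencies:", dict(zip(vocab, term_frequency.tolist())))
```

With weighted values such as TF-IDF scores, these sums would no longer be word counts, which is one reason plain counts are the safer input.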

Open the saved HTML file in a web browser to explore topics interactively.

Each bubble in the visualization represents a topic; a larger bubble means the topic accounts for a larger share of the words in your documents.
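
The idea behind bubble size can be sketched with a few numbers. This example assumes that a topic's share is found by weighting each document's topic mix by the document's length; the values are made up, and the exact scaling pyLDAvis applies when drawing the bubbles is not shown here.

```python
import numpy as np

# Hypothetical values: 3 documents, 2 topics.
# Each row is one document's topic mix and sums to 1.
doc_topic = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
])
doc_lengths = np.array([10, 30, 20])  # words per document

# Estimated number of words belonging to each topic
topic_tokens = doc_lengths @ doc_topic
# Share of all words per topic; a larger share means a larger bubble
topic_share = topic_tokens / topic_tokens.sum()

for k, share in enumerate(topic_share):
    print(f"Topic {k}: {share:.1%} of words")
```

Here the second topic gets the larger share because it dominates the longest document.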

Summary

pyLDAvis helps you see and understand topics found in text data.

You prepare data from your model and then display or save the visualization.

Interactive visuals make it easier to explain what your topic model learned.