Topic visualization shows the main themes a topic model found in a collection of text. It makes the topics easier to understand and to explain to others.
Visualizing topics (pyLDAvis) in NLP
Introduction
You want to explore what themes appear in customer reviews.
You need to explain topics found in news articles to your team.
You want to check if your topic model grouped words well.
You want to compare topics from different sets of documents.
Syntax
    import pyLDAvis
    import pyLDAvis.sklearn

    # Prepare visualization data
    vis_data = pyLDAvis.sklearn.prepare(lda_model, dtm, vectorizer)

    # Show visualization in notebook or save as html
    pyLDAvis.display(vis_data)
    # or
    pyLDAvis.save_html(vis_data, 'lda_vis.html')
lda_model is your trained topic model.
dtm is the document-term matrix used for training.
vectorizer is the fitted vectorizer that produced dtm; it supplies the vocabulary.
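The name of the scikit-learn adapter module depends on the installed pyLDAvis release: to the best of my knowledge it is pyLDAvis.sklearn in older releases and pyLDAvis.lda_model from version 3.4.0 onward. The sketch below is one way to tolerate both. The helper name load_sklearn_adapter is hypothetical and not part of pyLDAvis.

```python
import importlib


def load_sklearn_adapter():
    """Return pyLDAvis's scikit-learn adapter module, or None if unavailable.

    Hypothetical helper: tries the newer module name first, then the older one.
    """
    for name in ("pyLDAvis.lda_model", "pyLDAvis.sklearn"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # pyLDAvis is missing, or this release uses the other name
    return None


adapter = load_sklearn_adapter()
if adapter is None:
    print("pyLDAvis scikit-learn adapter not found; is pyLDAvis installed?")
else:
    print("Using adapter module:", adapter.__name__)
```

With the adapter loaded this way, the prepare call would be written as adapter.prepare(lda_model, dtm, vectorizer).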
Examples
Visualize topics directly in a Jupyter notebook.
    import pyLDAvis
    import pyLDAvis.sklearn

    vis_data = pyLDAvis.sklearn.prepare(lda, dtm, vectorizer)
    pyLDAvis.display(vis_data)
Save the interactive visualization as an HTML file to open in a browser later.
    pyLDAvis.save_html(vis_data, 'topics.html')
Sample Model
This code trains a simple topic model on a few sentences, then creates and saves an interactive visualization showing the topics and important words.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    import pyLDAvis
    import pyLDAvis.sklearn

    # Sample documents
    texts = [
        'I love reading books about science and technology.',
        'The new movie was fantastic and thrilling.',
        'Technology advances help science grow.',
        'Movies and books are great entertainment.',
        'Science and technology are closely related fields.'
    ]

    # Convert texts to document-term matrix
    vectorizer = CountVectorizer(stop_words='english')
    dtm = vectorizer.fit_transform(texts)

    # Train LDA model with 2 topics
    lda = LatentDirichletAllocation(n_components=2, random_state=42)
    lda.fit(dtm)

    # Prepare visualization
    vis_data = pyLDAvis.sklearn.prepare(lda, dtm, vectorizer)

    # Save visualization to HTML
    pyLDAvis.save_html(vis_data, 'lda_visualization.html')
    print('LDA visualization saved as lda_visualization.html')
Important Notes
pyLDAvis works best with models trained on raw term counts, such as the document-term matrix produced by CountVectorizer.
Open the saved HTML file in a web browser to explore topics interactively.
Each bubble in the visualization represents one topic; a larger bubble means the topic accounts for a larger share of the words in the corpus.
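To make the bubble-size note concrete, the sketch below estimates each topic's share of the corpus by weighting every document's topic mixture by its length. This approximates the quantity that bubble size reflects, as far as I understand it, and is not pyLDAvis's own code. The variable names are illustrative.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Same sample documents as in the tutorial
texts = [
    "I love reading books about science and technology.",
    "The new movie was fantastic and thrilling.",
    "Technology advances help science grow.",
    "Movies and books are great entertainment.",
    "Science and technology are closely related fields.",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(texts)

lda = LatentDirichletAllocation(n_components=2, random_state=42)
lda.fit(dtm)

# Per-document topic mixture: shape (n_docs, n_topics), each row sums to 1
doc_topic = lda.transform(dtm)

# Number of counted tokens in each document (after stop-word removal)
doc_lengths = np.asarray(dtm.sum(axis=1)).ravel()

# Expected number of tokens assigned to each topic, then normalized to shares
topic_tokens = (doc_topic * doc_lengths[:, None]).sum(axis=0)
topic_share = topic_tokens / topic_tokens.sum()

for topic_idx, share in enumerate(topic_share):
    print(f"Topic {topic_idx}: {share:.1%} of tokens")
```

A topic with a noticeably larger share here would be expected to appear as the larger bubble.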
Summary
pyLDAvis helps you see and understand topics found in text data.
You prepare data from your model and then display or save the visualization.
Interactive visuals make it easier to explain what your topic model learned.