Practice

1. What does topic coherence measure in topic modeling?

easy

A. How understandable and meaningful the topics are

B. The speed of the model training

C. The number of topics generated

D. The size of the dataset used

Solution

Step 1: Understand the purpose of topic coherence
Topic coherence measures how strongly the top words of a topic relate to each other, which serves as a proxy for how interpretable the topic is to a human reader.
Step 2: Compare options to definition
Only option A ("How understandable and meaningful the topics are") matches this definition; the other options describe unrelated aspects such as training speed, topic count, or dataset size.
Final Answer:
How understandable and meaningful the topics are -> Option A
Quick Check:
Topic coherence = Understandability [OK]

Hint: Coherence = topic clarity and meaning [OK]

Common Mistakes:

Confusing coherence with model speed
Thinking coherence counts topics
Mixing coherence with dataset size
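
Example (illustrative sketch): the idea that coherent topics contain words that occur together can be shown with a small co-occurrence score. The function name umass_style_coherence, the +1 smoothing, the averaging over pairs, and the toy documents are assumptions made for this sketch. It is loosely modeled on the UMass measure and is not Gensim's implementation. It also assumes every topic word appears in at least one document.

```python
import math
from itertools import combinations


def umass_style_coherence(topic_words, documents, eps=1.0):
    """Average smoothed log-conditional co-occurrence over ordered word pairs."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(all(w in doc for w in words) for doc in doc_sets)

    scores = []
    for earlier, later in combinations(topic_words, 2):
        # Smoothed estimate of how often `later` appears in documents
        # that contain `earlier`; eps keeps zero co-occurrence finite.
        scores.append(math.log((doc_freq(earlier, later) + eps) / doc_freq(earlier)))
    return sum(scores) / len(scores)


# Toy corpus (made up for illustration): three pet documents, two finance ones.
docs = [
    ["cat", "dog", "pet", "vet"],
    ["dog", "pet", "leash"],
    ["cat", "pet", "food"],
    ["stock", "market", "trade"],
    ["market", "trade", "price"],
]

coherent = umass_style_coherence(["pet", "dog", "cat"], docs)
mixed = umass_style_coherence(["pet", "market", "leash"], docs)
print(coherent > mixed)
```

The topic whose words share documents scores higher than the topic that mixes pet and finance words, which is the behavior a coherence measure is meant to capture.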

2. Which Python library is commonly used to calculate topic coherence?

easy

A. NumPy

B. Gensim

C. Matplotlib

D. Pandas

Solution

Step 1: Recall libraries for NLP topic modeling
Gensim is a popular library for topic modeling and includes coherence calculation tools.
Step 2: Eliminate unrelated libraries
NumPy handles numerical arrays, Matplotlib handles plotting, and Pandas handles data frames; none of them provides a topic coherence measure.
Final Answer:
Gensim -> Option B
Quick Check:
Coherence calculation library = Gensim [OK]

Hint: Gensim handles topic coherence easily [OK]

Common Mistakes:

Choosing NumPy for coherence
Confusing plotting with coherence calculation
Picking Pandas for topic modeling

3. Given this code snippet, what is the output type of coherence_score?

from gensim.models import CoherenceModel
coherence_model = CoherenceModel(model=lda_model, texts=tokenized_texts, dictionary=dictionary, coherence='c_v')
coherence_score = coherence_model.get_coherence()

medium

A. A string describing the model

B. A list of topic words

C. A dictionary of topic counts

D. A float number representing coherence score

Solution

Step 1: Understand CoherenceModel.get_coherence()
This method returns a single float: the coherence score aggregated over all topics of the model.
Step 2: Check other options
It does not return lists, dictionaries, or strings describing the model.
Final Answer:
A float number representing coherence score -> Option D
Quick Check:
get_coherence() returns float score [OK]

Hint: get_coherence() returns a float score [OK]

Common Mistakes:

Expecting a list of words instead of a score
Thinking it returns a dictionary
Confusing output with model description

4. Identify the error in this code for calculating topic coherence:

coherence_model = CoherenceModel(model=lda_model, texts=tokenized_texts, coherence='c_v')
score = coherence_model.get_coherence()

medium

A. Incorrect method name get_coherence_score()

B. texts parameter should be a string, not list

C. Missing dictionary parameter in CoherenceModel

D. Model parameter should be a string, not lda_model

Solution

Step 1: Check required parameters for CoherenceModel
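CoherenceModel needs a dictionary to map words to ids, and it should be passed explicitly. If it is omitted, Gensim can only fall back to the model's own id2word mapping and raises an error when that mapping is not set.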
Step 2: Verify method and parameter types
get_coherence() is the correct method name, texts should be a list of tokenized documents, and model is correctly passed as the lda_model object.
Final Answer:
Missing dictionary parameter in CoherenceModel -> Option C
Quick Check:
Dictionary not passed = the flaw in this call [OK]

Hint: Always include dictionary when using CoherenceModel [OK]

Common Mistakes:

Using wrong method name
Passing texts as string instead of list
Passing model as string instead of object

5. You have two topic models with coherence scores 0.35 and 0.55. What should you do to improve the model with 0.35 coherence?

hard

A. Increase the number of topics and recalculate coherence

B. Reduce the dataset size to speed up training

C. Ignore coherence and pick the model with fewer topics

D. Change the coherence measure to 'u_mass' without retraining

Solution

Step 1: Understand coherence score meaning
For the same coherence measure and the same corpus, a higher score generally indicates more interpretable topics.
Step 2: Improve model by adjusting topics
The number of topics is a tunable setting. Retraining with a different topic count and recalculating coherence shows whether the topics capture the themes better.
Step 3: Evaluate other options
Reducing the dataset size or ignoring coherence does not improve topic quality. Switching the measure to 'u_mass' without retraining only changes how the score is computed; the topics themselves stay the same.
Final Answer:
Increase the number of topics and recalculate coherence -> Option A
Quick Check:
Better coherence = tune topics [OK]

Hint: Tune topic count to improve coherence [OK]

Common Mistakes:

Ignoring coherence scores
Changing measure without retraining
Reducing data size instead of improving model
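
Example (illustrative sketch): tuning the topic count usually means scoring several candidate counts and keeping the best one. The helper pick_num_topics and the score values below are hypothetical and exist only to show the control flow. In practice, score_fn(k) would train a model with k topics and return its get_coherence() value.

```python
def pick_num_topics(candidates, score_fn):
    """Return (best_k, scores), where best_k has the highest score_fn(k)."""
    scores = {k: score_fn(k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Made-up coherence values per topic count, NOT real measurements.
hypothetical_scores = {5: 0.35, 10: 0.48, 15: 0.55, 20: 0.51}

# A dict lookup stands in for "train a model with k topics and score it".
best_k, scores = pick_num_topics(hypothetical_scores, hypothetical_scores.get)
print(best_k)
```

Scores from different candidates are only comparable when every candidate uses the same coherence measure and the same texts.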

Epoch	Loss ↓	Accuracy ↑	Observation
1	0.85	N/A	Initial topic model training with random initialization
2	0.65	N/A	Topics start to form meaningful word groups
3	0.5	N/A	Coherence scores improve as topics become clearer
4	0.45	N/A	Loss decreases steadily, topics stabilize
5	0.43	N/A	Final epoch with best coherence scores

Topic coherence evaluation in NLP - Model Pipeline Trace
