
Topic coherence evaluation in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1: fill in the blank (easy)

Complete the code to calculate the coherence score using Gensim's CoherenceModel.

NLP
from gensim.models.coherencemodel import CoherenceModel
coherence_model = CoherenceModel(model=lda_model, texts=tokenized_texts, dictionary=dictionary, coherence='[1]')
coherence_score = coherence_model.get_coherence()
Options:
A. c_v
B. c_uci
C. c_npmi
D. u_mass
Common Mistakes
Using 'u_mass', which requires a corpus in BoW format and is less interpretable.
Confusing coherence types and using one not supported by the model.
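To make the idea of a coherence score more concrete, here is a minimal standard-library sketch of a UMass-style measure. It assumes the commonly cited formulation with +1 smoothing and averages over word pairs; the function name `umass_coherence` and the toy data are illustrative assumptions, and Gensim's own implementation differs in smoothing and aggregation, so the numbers will not match `CoherenceModel` output.

```python
import math
from itertools import combinations


def umass_coherence(topic_words, tokenized_texts):
    """Average UMass-style score over the word pairs of one topic (sketch)."""
    doc_sets = [set(doc) for doc in tokenized_texts]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    scores = []
    # Pair each word with every word ranked after it in the topic.
    for earlier, later in combinations(topic_words, 2):
        denom = doc_freq(earlier)
        if denom == 0:
            continue  # skip words that never occur in the texts
        scores.append(math.log((doc_freq(earlier, later) + 1) / denom))
    return sum(scores) / len(scores) if scores else float("nan")


# Hypothetical toy corpus: two "pet" documents and two "finance" documents.
toy_texts = [
    ["cat", "dog", "pet"],
    ["cat", "dog"],
    ["stock", "market"],
    ["stock", "market", "trade"],
]

coherent_score = umass_coherence(["cat", "dog"], toy_texts)
mixed_score = umass_coherence(["cat", "stock"], toy_texts)
print(coherent_score > mixed_score)
```

Words that appear together in the same documents score higher than words that never co-occur, which is the intuition every option in this task shares.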
Task 2: fill in the blank (medium)

Complete the code to preprocess texts by tokenizing and removing stopwords before coherence evaluation.

NLP
from nltk.corpus import stopwords
stop_words = set(stopwords.words('english'))
processed_texts = [[word for word in doc.lower().split() if word not in [1]] for doc in documents]
Options:
A. texts
B. dictionary
C. tokenized_texts
D. stop_words
Common Mistakes
Using the dictionary object instead of stopwords set.
Not converting words to lowercase before filtering.
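The preprocessing step can also be sketched without NLTK. The version below is only an illustration: `STOP_WORDS` is a small hand-picked set standing in for NLTK's English list, `preprocess` is a hypothetical helper name, and the regex tokenizer is a swapped-in alternative to `split()` that also strips punctuation.

```python
import re

# Assumption: a tiny hand-picked stopword set, not NLTK's full English list.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on"}


def preprocess(documents, stop_words=STOP_WORDS):
    """Lowercase each document, tokenize on letter runs, and drop stopwords."""
    processed = []
    for doc in documents:
        # Lowercase first so that "The" matches the stopword "the".
        tokens = re.findall(r"[a-z]+", doc.lower())
        processed.append([tok for tok in tokens if tok not in stop_words])
    return processed


documents = ["The cat is on the mat.", "Stocks and bonds are in the news!"]
processed_texts = preprocess(documents)
print(processed_texts)
```

With plain `split()`, a token such as `mat.` would keep its trailing period and never match a dictionary entry for `mat`, which is why the sketch tokenizes on letter runs instead.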
Task 3: fill in the blank (hard)

Fix the error in the code to compute coherence score by correctly passing the dictionary parameter.

NLP
coherence_model = CoherenceModel(model=lda_model, texts=tokenized_texts, [1]=dictionary, coherence='c_v')
score = coherence_model.get_coherence()
Options:
A. dictionary
B. dict
C. corpus
D. tokens
Common Mistakes
Using 'dict', which is the name of Python's built-in dictionary type, not a CoherenceModel parameter name.
Passing the corpus instead of dictionary.
Task 4: fill in the blanks (hard)

Fill both blanks to create a dictionary and corpus needed for topic coherence evaluation.

NLP
from gensim import corpora
[1] = corpora.Dictionary(tokenized_texts)
[2] = [[1].doc2bow(text) for text in tokenized_texts]
Options:
A. dictionary
B. corpus
C. texts
D. tokens
Common Mistakes
Swapping dictionary and corpus variable names.
Using undefined variable names.
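To see what the dictionary and the bag-of-words corpus contain, the following standard-library sketch mimics the two steps. It is not Gensim's implementation: `build_dictionary` and `doc2bow` here are hypothetical stand-ins, and ids are assigned in order of first appearance, which may differ from the ids `corpora.Dictionary` assigns.

```python
from collections import Counter


def build_dictionary(tokenized_texts):
    """Map each distinct token to an integer id, in order of first appearance."""
    token2id = {}
    for text in tokenized_texts:
        for token in text:
            token2id.setdefault(token, len(token2id))
    return token2id


def doc2bow(token2id, text):
    """Return sorted (token_id, count) pairs for the known tokens in one document."""
    counts = Counter(token2id[tok] for tok in text if tok in token2id)
    return sorted(counts.items())


tokenized_texts = [["cat", "dog", "cat"], ["dog", "fish"]]
token2id = build_dictionary(tokenized_texts)
bow_corpus = [doc2bow(token2id, text) for text in tokenized_texts]
print(token2id)
print(bow_corpus)
```

The dictionary has to exist before the corpus can be built, because each document is converted by looking its tokens up in the dictionary. That ordering is what the two blanks in this task test.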
Task 5: fill in the blanks (hard)

Fill all three blanks to compute and print the coherence score for an LDA model.

NLP
coherence_model = CoherenceModel(model=[1], texts=[2], dictionary=[3], coherence='c_v')
score = coherence_model.get_coherence()
print(f"Coherence Score: {score:.4f}")
Options:
A. lda_model
B. tokenized_texts
C. dictionary
D. corpus
Common Mistakes
Using corpus instead of tokenized texts for the texts parameter.
Passing the wrong variable names, which causes runtime errors.