
Identify the error in this code snippet that uses LDA with scikit-learn:

Medium · Debug · Q14 of 15
NLP - Topic Modeling
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["cat dog", "dog mouse", "cat mouse"]
vectorizer = CountVectorizer()
dtm = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2)
lda.fit_transform(dtm)
print(lda.components_)
A. lda.fit_transform returns a matrix but the code ignores it
B. CountVectorizer should be replaced with TfidfVectorizer
C. lda.components_ attribute does not exist
D. n_components must be equal to number of documents
Step-by-Step Solution
  1. Step 1: Check usage of fit_transform

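    lda.fit_transform(dtm) fits the model and returns the document-topic distribution matrix, of shape (n_documents, n_components). The code discards this return value, so the per-document topic mixtures are lost. The script still runs and prints lda.components_, so this is a logic flaw rather than a runtime error; if only the topics are needed, lda.fit(dtm) is enough.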
  2. Step 2: Verify attribute and parameters

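    lda.components_ exists after fitting, with shape (n_components, n_vocabulary_terms). n_components can be any positive integer and does not depend on the number of documents. CountVectorizer is valid here because LDA models raw term counts.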
  3. Final Answer:

    lda.fit_transform returns a matrix but the code ignores it -> Option A
  4. Quick Check:

    Capture the fit_transform output when the topic distributions are needed; otherwise call fit.
Quick Trick: Store the fit_transform output whenever you need the per-document topic distributions.
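
The fix described in the solution can be sketched as follows. This is a minimal version of the quiz snippet with the return value captured; the variable name doc_topics and the random_state=0 argument are assumptions added here for readability and repeatable runs, and are not part of the original question.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["cat dog", "dog mouse", "cat mouse"]
vectorizer = CountVectorizer()
dtm = vectorizer.fit_transform(docs)  # document-term count matrix, 3 docs x 3 terms

# random_state is an added assumption so that repeated runs give the same result
lda = LatentDirichletAllocation(n_components=2, random_state=0)

# Keep the return value: one row per document, one column per topic
doc_topics = lda.fit_transform(dtm)

print(doc_topics.shape)       # (3, 2)
print(lda.components_.shape)  # (2, 3)
```

Each row of doc_topics is the topic mixture for one document, while lda.components_ holds the topic-term weights. If only the topic-term weights are of interest, calling lda.fit(dtm) avoids producing a matrix that is never used.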
Common Mistakes:
  • Ignoring fit_transform output
  • Thinking components_ attribute is missing
  • Believing n_components must match document count
