LDA with scikit-learn in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to import the LDA model from scikit-learn.

from sklearn.decomposition import [1]
Options:
A. LatentDirichletAllocation
B. PCA
C. TruncatedSVD
D. KMeans
Common Mistakes
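Importing PCA or KMeans instead of LatentDirichletAllocation. (KMeans is a clustering class in sklearn.cluster, so importing it from sklearn.decomposition fails.)
Using TruncatedSVD, which performs SVD-based dimensionality reduction (as in LSA) rather than LDA topic modeling.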
2. Fill in the blank (medium)

Complete the code to create an LDA model with 5 topics.

lda = LatentDirichletAllocation(n_components=[1], random_state=42)
Options:
A. 3
B. 10
C. 5
D. 1
Common Mistakes
Choosing 1, which assigns every document to a single topic.
Choosing 3 or 10, neither of which matches the requested 5 topics.
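
As a minimal sketch of what this step does, the snippet below constructs the model and inspects it before any fitting. The variable names `n_topics` and `is_fitted` are illustrative and not part of the exercise.

```python
from sklearn.decomposition import LatentDirichletAllocation

# n_components sets the number of topics; random_state makes runs repeatable.
lda = LatentDirichletAllocation(n_components=5, random_state=42)

# The constructor only stores settings; nothing is learned yet.
n_topics = lda.get_params()["n_components"]

# Learned attributes (trailing underscore) appear only after fit().
is_fitted = hasattr(lda, "components_")

print(n_topics, is_fitted)
```

Creating the estimator is cheap; the topics themselves are estimated only when `fit` is called on a document-term matrix.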
3. Fill in the blank (hard)

Fix the error in the code to fit the LDA model on the document-term matrix named 'dtm'.

lda.fit([1])
Options:
A. documents
B. dtm
C. lda
D. vectorizer
Common Mistakes
Passing raw documents instead of the document-term matrix.
Passing the vectorizer or the model itself.
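
The sketch below shows one way `dtm` might be produced and passed to `fit`. The toy corpus, the use of `CountVectorizer` with English stop words, and the choice of 2 topics are assumptions made for illustration; the exercise itself does not specify how `dtm` was built.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, for illustration only.
documents = [
    "cats and dogs are popular pets",
    "dogs chase cats around the garden",
    "stocks and bonds are common investments",
    "investors buy stocks when markets rise",
]

# The vectorizer turns raw text into a sparse document-term count matrix.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(documents)

lda = LatentDirichletAllocation(n_components=2, random_state=42)
lda.fit(dtm)  # fit on the matrix, not on `documents` or `vectorizer`

# components_ has one row per topic and one column per vocabulary term.
print(lda.components_.shape, dtm.shape)
```

LDA expects non-negative counts, which is why the numeric matrix is passed rather than the raw strings.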
4. Fill in the blanks (hard)

Fill both blanks to get the topic-word distribution and the top words for the first topic.

topic_word = lda.[1]_
top_words = [vectorizer.get_feature_names_out()[i] for i in topic_word[[2]].argsort()[-10:]]
Options:
A. components
B. transform
C. 0
D. 1
Common Mistakes
Using 'transform', which is a method, not an attribute.
Using index 1, which selects the second topic, not the first.
5. Fill in the blanks (hard)

Fill all three blanks to transform documents to topic distributions and print the topic distribution for the first document.

doc_topic_dist = lda.[1](dtm)
print(doc_topic_dist[[2]])
print(doc_topic_dist.shape[[3]])
Options:
A. transform
B. 0
D. fit_transform
Common Mistakes
Using 'fit_transform', which refits the model before transforming; when the model has already been fitted, 'transform' is enough.
Using the wrong index for the document (row) or for the shape axis.
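
The following sketch shows what the document-topic matrix looks like on a small hypothetical corpus with 2 topics. The names `first_doc` and `n_topics` are illustrative only.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, for illustration only.
documents = [
    "cats and dogs are popular pets",
    "dogs chase cats around the garden",
    "stocks and bonds are common investments",
    "investors buy stocks when markets rise",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(documents)
lda = LatentDirichletAllocation(n_components=2, random_state=42).fit(dtm)

# transform() reuses the fitted model: one row per document,
# one column per topic.
doc_topic_dist = lda.transform(dtm)

first_doc = doc_topic_dist[0]        # topic mixture of the first document
n_topics = doc_topic_dist.shape[1]   # axis 1 is the topic axis

print(first_doc, n_topics)
```

Each row is a distribution over topics, so its entries sum to 1; `shape[0]` gives the number of documents and `shape[1]` the number of topics.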