Latent Dirichlet Allocation (LDA) in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to import the LDA model from scikit-learn.

from sklearn.decomposition import [1]
Options:
A. KMeans
B. PCA
C. LatentDirichletAllocation
D. TruncatedSVD
Common Mistakes
Importing PCA or KMeans instead of LatentDirichletAllocation.
Misspelling the class name.
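
As a quick check on where these classes live, the sketch below imports each option from its scikit-learn module. It assumes scikit-learn is installed; it only illustrates the module layout and is not part of the exercise itself.

```python
# LatentDirichletAllocation, PCA and TruncatedSVD are all exposed by
# sklearn.decomposition; KMeans is a clustering estimator in sklearn.cluster.
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA, LatentDirichletAllocation, TruncatedSVD

for cls in (LatentDirichletAllocation, PCA, TruncatedSVD, KMeans):
    # __module__ shows the package each class is defined in
    print(cls.__name__, "->", cls.__module__)
```

Only LatentDirichletAllocation is a topic model; the other three are general-purpose dimensionality reduction or clustering estimators.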
2. Fill in the blank (medium)

Complete the code to create an LDA model with 5 topics.

lda = LatentDirichletAllocation(n_components=[1], random_state=42)
Options:
A. 5
B. 3
C. 10
D. 1
Common Mistakes
Choosing a different number of topics than requested.
Confusing n_components with other parameters.
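
A minimal sketch of how the constructor arguments behave, assuming scikit-learn is installed. In scikit-learn's LDA, n_components sets the number of topics and random_state fixes the seed so repeated runs give the same result.

```python
from sklearn.decomposition import LatentDirichletAllocation

# n_components is the number of topics; random_state makes runs repeatable
lda = LatentDirichletAllocation(n_components=5, random_state=42)

print(lda.n_components)                    # number of topics requested
print(lda.get_params()["random_state"])    # seed stored on the estimator

# The learned topic-word matrix (components_) only exists after fit()
print(hasattr(lda, "components_"))
```

Creating the estimator does no training; the topics are learned only when the model is fitted on a document-term matrix.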
3. Fill in the blank (hard)

Complete the code to fit the LDA model on the document-term matrix named 'dtm'.

lda.fit([1])
Options:
A. documents
B. dtm
C. topics
D. labels
Common Mistakes
Passing raw documents or labels instead of the document-term matrix.
Using a variable that does not exist.
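
The exercise does not show where 'dtm' comes from. The sketch below is one plausible way to build it, assuming a CountVectorizer and a small made-up corpus; the documents, the stop-word setting, and the choice of 2 topics are illustrative assumptions, not part of the original task.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, used only for illustration
documents = [
    "cats and dogs are popular pets",
    "dogs chase cats in the garden",
    "stocks and bonds are common investments",
    "investors buy stocks when markets rise",
]

# Turn raw text into a document-term matrix of word counts
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(documents)  # sparse, shape (n_docs, n_terms)

# LDA is fitted on the count matrix, not on the raw strings
lda = LatentDirichletAllocation(n_components=2, random_state=42)
lda.fit(dtm)

print(dtm.shape)              # (number of documents, vocabulary size)
print(lda.components_.shape)  # (number of topics, vocabulary size)
```

LDA is unsupervised, so fit takes only the count matrix; no labels are needed.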
4. Fill in the blank (hard)

Fill both blanks to get the topic distribution for the first document.

topic_distribution = lda.[1]([2])[0]
Options:
A. transform
B. dtm
C. fit_transform
D. documents
Common Mistakes
Using fit_transform on a model that is already fitted, which retrains it instead of reusing the learned topics.
Passing raw documents instead of the document-term matrix.
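
A short sketch of reading per-document topic proportions from an already fitted model. The corpus, the vectorizer settings, and the choice of 2 topics are assumptions made for illustration.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, used only for illustration
documents = [
    "cats and dogs are popular pets",
    "dogs chase cats in the garden",
    "stocks and bonds are common investments",
    "investors buy stocks when markets rise",
]
dtm = CountVectorizer(stop_words="english").fit_transform(documents)

lda = LatentDirichletAllocation(n_components=2, random_state=42)
lda.fit(dtm)

# transform() reuses the fitted topics; it does not retrain the model
doc_topics = lda.transform(dtm)      # shape (n_docs, n_topics)
topic_distribution = doc_topics[0]   # row for the first document

print(topic_distribution)
print(topic_distribution.sum())      # proportions for one document sum to about 1
```

Each row of the result is one document's mixture over topics, so indexing with [0] selects the first document.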
5. Fill in the blank (hard)

Fill all three blanks to create a dictionary of the top 5 words per topic from the LDA components.

top_words = {i: [feature_names[j] for j in lda.components_[i].argsort()[-[1]:][::-1]] for i in range([2])}
print(top_words)

# feature_names = vectorizer.get_feature_names_out()
# lda = LatentDirichletAllocation(n_components=[3])
Options:
A. 10
B. 5
C. lda.n_components
D. lda.components_.shape[0]
Common Mistakes
Using the wrong number of top words, such as 10 instead of 5.
Confusing the number of topics with the number of words.
Hard-coding the number of topics instead of reading it from the fitted model.