TF-IDF (TfidfVectorizer) in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to import the TfidfVectorizer from scikit-learn.

from sklearn.feature_extraction.text import [1]
Options:
A. LabelEncoder
B. CountVectorizer
C. StandardScaler
D. TfidfVectorizer
Common Mistakes
Importing CountVectorizer instead of TfidfVectorizer
Importing unrelated classes like StandardScaler or LabelEncoder
2. Fill in the blank (medium)

Complete the code to create a TfidfVectorizer instance with English stop words removed.

vectorizer = TfidfVectorizer(stop_words=[1])
Options:
A. None
B. True
C. 'english'
D. False
Common Mistakes
Using True or False instead of the string 'english'
Using None, which means no stop words are removed
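
To see what the stop_words argument changes, the short sketch below fits two vectorizers on the same text and compares their vocabularies. The two-sentence corpus and the variable names are made up for illustration only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, used only for illustration.
documents = ["the cat sat on the mat", "the dog chased the cat"]

# stop_words=None (the default) keeps every token.
keep_all = TfidfVectorizer(stop_words=None).fit(documents)

# stop_words='english' drops words found in scikit-learn's built-in English list.
no_stops = TfidfVectorizer(stop_words="english").fit(documents)

# vocabulary_ maps each learned term to its column index.
vocab_all = sorted(keep_all.vocabulary_)
vocab_filtered = sorted(no_stops.vocabulary_)

print(vocab_all)
print(vocab_filtered)
```

Common words such as "the" should appear only in the first vocabulary.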
3. Fill in the blank (hard)

Complete the code to fit the vectorizer and transform the documents into TF-IDF features.

tfidf_matrix = vectorizer.[1](documents)
Options:
A. fit_transform
B. fit
C. transform_fit
D. fit_transformer
Common Mistakes
Using fit alone, which learns the vocabulary and IDF weights but does not return transformed data
Using non-existent methods such as transform_fit or fit_transformer
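
The difference between fit, transform, and fit_transform can be checked directly. The sketch below is a minimal illustration on a made-up two-sentence corpus; the variable names are assumptions, not part of the exercise.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, used only for illustration.
documents = ["the cat sat on the mat", "the dog chased the cat"]

vectorizer = TfidfVectorizer()

# fit() learns the vocabulary and IDF weights and returns the vectorizer itself.
fitted = vectorizer.fit(documents)

# transform() then converts text into a sparse TF-IDF matrix.
two_step = fitted.transform(documents)

# fit_transform() performs both steps in a single call.
one_step = TfidfVectorizer().fit_transform(documents)

print(one_step.shape)  # (number of documents, number of vocabulary terms)
```

Both routes should give the same matrix; fit_transform is simply the shorter way to write it when fitting and transforming the same data.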
4. Fill in the blanks (hard)

Fill both blanks to get the feature names and convert the TF-IDF matrix to a dense array.

feature_names = vectorizer.[1]()
dense_matrix = tfidf_matrix.[2]()
Options:
A. get_feature_names_out
B. toarray
C. fit_transform
D. transform
Common Mistakes
Using fit_transform instead of get_feature_names_out
Using transform instead of toarray
5. Fill in the blanks (hard)

Fill all three blanks to create a TfidfVectorizer limited to at most 1000 features, fit it to the documents and transform them, and get the feature names.

vectorizer = TfidfVectorizer(max_features=[1])
tfidf_matrix = vectorizer.[2](documents)
features = vectorizer.[3]()
Options:
A. 1000
B. fit_transform
C. get_feature_names_out
D. 500
Common Mistakes
Using 500 instead of 1000 for max_features
Using fit instead of fit_transform
Using get_feature_names instead of get_feature_names_out (the older method was removed in scikit-learn 1.2)