Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
Easy: Complete the code to create a document-term matrix using CountVectorizer.
NLP
from sklearn.feature_extraction.text import CountVectorizer

docs = ['I love AI', 'AI loves me']
vectorizer = CountVectorizer()
dtm = vectorizer.[1](docs)
print(dtm.toarray())
Common Mistakes
Using transform before fitting the vectorizer, which raises a NotFittedError.
Calling fit without transforming the data: fit returns the vectorizer itself, not a matrix.
Answer: fit_transform. It learns the vocabulary and transforms the documents into a document-term matrix in one step.
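As a filled-in sketch of this task, assuming scikit-learn is installed and CountVectorizer keeps its default settings, the completed code could look like this. With those defaults the text is lowercased and single-character tokens are ignored, so "I" does not get a column.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ['I love AI', 'AI loves me']
vectorizer = CountVectorizer()

# Learn the vocabulary and build the document-term matrix in one call.
dtm = vectorizer.fit_transform(docs)

# Columns follow the sorted vocabulary: ai, love, loves, me.
print(sorted(vectorizer.vocabulary_))  # ['ai', 'love', 'loves', 'me']
print(dtm.toarray())
# [[1 1 0 0]
#  [1 0 1 1]]
```

Each row is one document and each column is one vocabulary term; the entries are raw counts.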
2. Fill in the blank
Medium: Complete the code to get the feature names (words) from the vectorizer.
NLP
from sklearn.feature_extraction.text import CountVectorizer

docs = ['Data science is fun']
vectorizer = CountVectorizer()
dtm = vectorizer.fit_transform(docs)
words = vectorizer.[1]()
print(words)
Common Mistakes
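Using get_feature_names, which was deprecated in scikit-learn 1.0 and removed in 1.2.
Reading vocabulary_ directly instead of calling the method: vocabulary_ is a dict mapping each term to its column index, not an ordered array of feature names.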
Answer: get_feature_names_out. It returns a NumPy array of the feature names (words) learned when the vectorizer was fitted.
3. Fill in the blank
Hard: Complete the code so that it correctly creates a document-term matrix from the list of documents.
NLP
from sklearn.feature_extraction.text import CountVectorizer

docs = ['Machine learning', 'Learning machines']
vectorizer = CountVectorizer()
dtm = vectorizer.[1](docs)
print(dtm.toarray())
Common Mistakes
Using transform without fitting first.
Using fit without transforming.
Answer: fit_transform. It both learns the vocabulary and transforms the documents into a matrix, so a single call on a fresh vectorizer is enough.
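The two mistakes listed for this task can be checked directly. The sketch below, assuming scikit-learn with default CountVectorizer settings, shows that transform on an unfitted vectorizer raises NotFittedError, and that fit followed by transform gives the same matrix as fit_transform.

```python
from sklearn.exceptions import NotFittedError
from sklearn.feature_extraction.text import CountVectorizer

docs = ['Machine learning', 'Learning machines']

# transform before fit: there is no vocabulary yet, so this fails.
unfitted = CountVectorizer()
try:
    unfitted.transform(docs)
    raised = False
except NotFittedError:
    raised = True
print(raised)  # True

# fit then transform yields the same counts as fit_transform.
two_step = CountVectorizer().fit(docs).transform(docs)
one_step = CountVectorizer().fit_transform(docs)
same = bool((two_step.toarray() == one_step.toarray()).all())
print(same)  # True

# Columns: learning, machine, machines.
print(one_step.toarray())
# [[1 1 0]
#  [1 0 1]]
```

Note that "machine" and "machines" are separate columns, because CountVectorizer does no stemming.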
4. Fill in the blanks
Hard: Fill both blanks to create a document-term matrix and get the feature names.
NLP
from sklearn.feature_extraction.text import CountVectorizer

docs = ['AI is amazing', 'Amazing AI']
vectorizer = CountVectorizer()
dtm = vectorizer.[1](docs)
features = vectorizer.[2]()
print(features)
Common Mistakes
Using transform instead of fit_transform on a vectorizer that has not been fitted.
Using the deprecated get_feature_names method.
Answers: [1] fit_transform, [2] get_feature_names_out. Use fit_transform to create the matrix and get_feature_names_out to get the feature names.
5. Fill in the blanks
Hard: Fill all three blanks to create a document-term matrix, get feature names, and print the matrix as an array.
NLP
from sklearn.feature_extraction.text import CountVectorizer

docs = ['Deep learning', 'Learning deep']
vectorizer = CountVectorizer()
dtm = vectorizer.[1](docs)
features = vectorizer.[2]()
print(dtm.[3]())
Common Mistakes
Using transform instead of fit_transform.
Using get_feature_names instead of get_feature_names_out.
Forgetting to convert the matrix to an array before printing.
Answers: [1] fit_transform, [2] get_feature_names_out, [3] toarray. fit_transform creates the sparse matrix, get_feature_names_out gets the words, and toarray converts the sparse matrix to a dense, readable NumPy array.