
Spam detection pipeline in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to import the necessary library for text vectorization.

from sklearn.feature_extraction.text import [1]
Options:
A. TfidfVectorizer
B. StandardScaler
C. LabelEncoder
D. CountVectorizer
Common Mistakes
Choosing CountVectorizer, which counts words but does not weight them by how informative they are.
Using LabelEncoder, which encodes target labels, not text features.
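
To make the difference between counting and weighting concrete, here is a minimal sketch that runs both vectorizers on the same text. The three messages are hypothetical toy data invented for illustration, and default vectorizer settings are assumed.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical toy messages, made up for this illustration.
messages = [
    "win a free prize now",
    "free entry win cash",
    "are we meeting for lunch",
]

# Raw term counts: every occurrence contributes 1, regardless of how common the word is.
count_vec = CountVectorizer()
counts = count_vec.fit_transform(messages)

# TF-IDF: words that appear in many messages receive a lower weight.
tfidf_vec = TfidfVectorizer()
tfidf = tfidf_vec.fit_transform(messages)

for word in ("free", "prize"):
    c = counts[0, count_vec.vocabulary_[word]]
    w = tfidf[0, tfidf_vec.vocabulary_[word]]
    print(f"{word}: count={c}, tfidf={w:.3f}")
```

In the first message both words have a count of 1, but "free" also occurs in the second message, so its TF-IDF weight comes out lower than that of "prize".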
2. Fill in the blank (medium)

Complete the code to split the dataset into training and testing sets.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=[1], random_state=42)
Options:
A. 1.0
B. 0.5
C. 0.1
D. 0.2
Common Mistakes
Using 1.0, which would put all data in the test set and leave nothing for training (scikit-learn rejects a float test_size outside the open interval 0 to 1).
Using 0.5, which splits the data evenly but leaves too little for training.
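
As a rough illustration of how test_size affects the split, the sketch below uses ten hypothetical placeholder messages (not a real dataset) and an 80/20 split. It also tries test_size=1.0, which scikit-learn is expected to refuse with a ValueError.

```python
from sklearn.model_selection import train_test_split

# Hypothetical placeholder data: ten messages with alternating labels.
X = [f"message {i}" for i in range(10)]
y = [i % 2 for i in range(10)]

# Hold out 20% of the samples for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), len(X_test))

# A float test_size must lie strictly between 0 and 1, so 1.0 should be rejected.
try:
    train_test_split(X, y, test_size=1.0, random_state=42)
    full_test_rejected = False
except ValueError:
    full_test_rejected = True
print("test_size=1.0 rejected:", full_test_rejected)
```

With ten samples, a test_size of 0.2 leaves eight for training and two for testing.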
3. Fill in the blank (hard)

Fix the error in the model training code by completing the missing classifier name.

from sklearn.naive_bayes import [1]
model = [1]()
model.fit(X_train, y_train)
Options:
A. GaussianNB
B. BernoulliNB
C. MultinomialNB
D. ComplementNB
Common Mistakes
Using GaussianNB, which assumes continuous, normally distributed features and is a poor fit for text counts.
Using BernoulliNB, which models binary (present/absent) features and is less commonly paired with TF-IDF weights.
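
For context, here is one way the vectorizer and a Naive Bayes classifier might fit together. This is a sketch under stated assumptions: the four training messages and their labels (1 for spam, 0 for not spam) are hypothetical, and MultinomialNB is used because it is a common choice for count-like or TF-IDF text features.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy training data: 1 = spam, 0 = not spam.
train_texts = [
    "win free prize now",
    "free cash win now",
    "lunch meeting tomorrow",
    "see you at the meeting",
]
train_labels = [1, 1, 0, 0]

# Turn the raw text into non-negative TF-IDF features.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# MultinomialNB expects non-negative features such as counts or TF-IDF weights.
model = MultinomialNB()
model.fit(X_train, train_labels)

# New text must go through the same fitted vectorizer before prediction.
X_new = vectorizer.transform(["free prize"])
prediction = model.predict(X_new)
print(prediction[0])
```

Note that new messages are passed through transform, not fit_transform, so they are mapped onto the vocabulary learned from the training data.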
4. Fill in the blank (hard)

Fill both blanks to create a dictionary comprehension that maps each word to its length if the length is greater than 3.

word_lengths = {word: [1] for word in words if len(word) [2] 3}
Options:
A. len(word)
B. >
C. <
D. word
Common Mistakes
Using the word itself as the value instead of its length.
Using less than ('<') instead of greater than ('>') in the condition.
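
A small worked example may help show how the value expression and the filter condition interact. The word list here is hypothetical.

```python
# Hypothetical word list for illustration.
words = ["spam", "ham", "free", "win", "prize"]

# Keep only words longer than 3 characters, mapping each to its length.
word_lengths = {word: len(word) for word in words if len(word) > 3}
print(word_lengths)  # {'spam': 4, 'free': 4, 'prize': 5}

# Swapping the comparison keeps the short words instead.
short_words = {word: len(word) for word in words if len(word) < 3 + 1}
print(short_words)  # {'ham': 3, 'win': 3}
```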
5. Fill in the blank (hard)

Fill all three blanks to create a filtered dictionary from data where keys are uppercase and values are positive.

filtered = {[1]: [2] for k, v in data.items() if v [3] 0}
Options:
A. k.upper()
B. v
C. >
D. k.lower()
Common Mistakes
Using lowercase keys instead of uppercase.
Using '<' instead of '>' in the condition.
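
To see the key transformation and the value filter working together, consider this short sketch. The contents of data are hypothetical and chosen to include a zero and a negative value.

```python
# Hypothetical word scores, including a zero and a negative value.
data = {"free": 3, "win": 0, "cash": -1, "prize": 2}

# Uppercase each key and keep only entries whose value is strictly positive.
filtered = {k.upper(): v for k, v in data.items() if v > 0}
print(filtered)  # {'FREE': 3, 'PRIZE': 2}
```

Because the condition is v > 0, the entry with value 0 is dropped along with the negative one.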