Complete the code to import the necessary library for text vectorization.
from sklearn.feature_extraction.text import [1]
The TfidfVectorizer converts text data into numerical features based on term frequency-inverse document frequency, which is commonly used in spam detection.
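As a rough illustration of what the vectorizer produces, here is a minimal sketch with the blank filled in. The three messages are invented for this example and are not part of the exercise; the default tokenizer settings are assumed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy messages, invented for illustration only.
messages = [
    "win a free prize now",
    "meeting moved to friday",
    "free entry win cash",
]

vectorizer = TfidfVectorizer()

# fit_transform learns the vocabulary and returns a sparse matrix
# with one row per message and one column per learned term.
X = vectorizer.fit_transform(messages)

n_messages, n_terms = X.shape
print(n_messages, n_terms)
print(sorted(vectorizer.vocabulary_))
```

With default settings the tokenizer keeps only tokens of two or more characters, so the single-letter word "a" does not appear in the vocabulary.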
Complete the code to split the dataset into training and testing sets.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=[1], random_state=42)
Using test_size=0.2 holds out 20% of the data for testing and trains on the remaining 80%, a common ratio for evaluating model performance. Setting random_state=42 makes the split reproducible.
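The effect of the split can be checked on a small stand-in dataset. In this sketch, X and y are hypothetical placeholders (ten samples with alternating labels), not the exercise's actual data.

```python
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 10 single-feature samples and their labels.
X = [[i] for i in range(10)]
y = [i % 2 for i in range(10)]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 20% of 10 samples go to the test set, the rest to the training set.
print(len(X_train), len(X_test))
```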
Fix the error in the model training code by completing the missing classifier name.
from sklearn.naive_bayes import [1]
model = [1]()
model.fit(X_train, y_train)
MultinomialNB is a common choice for text classification tasks like spam detection because it models discrete features such as word counts. It is designed for integer counts, but in practice it often works with fractional tf-idf values as well.
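The three steps can be combined into one small end-to-end sketch. The texts and labels here are invented for illustration, and make_pipeline is an addition not used in the exercise: it simply chains the vectorizer and the classifier so that raw strings can be passed to fit and predict.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: 1 marks spam, 0 marks a normal message.
texts = [
    "win a free prize now",
    "free cash offer claim now",
    "meeting moved to friday",
    "lunch at noon tomorrow",
]
labels = [1, 1, 0, 0]

# Chain tf-idf vectorization and the Naive Bayes classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

# predict returns an array with one label per input message.
prediction = model.predict(["free cash prize"])
print(prediction)
```

A dataset this small only demonstrates the API; it says nothing about how well the model would perform on real messages.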
Fill both blanks to create a dictionary comprehension that maps each word to its length if the length is greater than 3.
word_lengths = {word: [1] for word in words if len(word) [2] 3}
The dictionary comprehension maps each word to its length using len(word) and filters words with length greater than 3 using >.
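With both blanks filled in, the comprehension can be tried on a sample list. The contents of words are an assumption made for this example.

```python
# Hypothetical sample input, invented for illustration.
words = ["spam", "ham", "offer", "to", "prize"]

# Keep only words longer than 3 characters, mapping each to its length.
word_lengths = {word: len(word) for word in words if len(word) > 3}

print(word_lengths)  # {'spam': 4, 'offer': 5, 'prize': 5}
```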
Fill all three blanks to create a filtered dictionary from data where keys are uppercase and values are positive.
filtered = {[1]: [2] for k, v in data.items() if v [3] 0}
The dictionary comprehension uses k.upper() to convert keys to uppercase, keeps values v, and filters for values greater than zero using >.
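Here is the completed comprehension applied to a sample dictionary. The contents of data are hypothetical and chosen so that negative and zero values are both filtered out.

```python
# Hypothetical sample input, invented for illustration.
data = {"spam": 3, "ham": -1, "offer": 0, "prize": 7}

# Uppercase each key and keep only entries whose value is strictly positive.
filtered = {k.upper(): v for k, v in data.items() if v > 0}

print(filtered)  # {'SPAM': 3, 'PRIZE': 7}
```

Because the condition is v > 0 rather than v >= 0, the entry with value 0 is dropped along with the negative one.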