Complete the code to tokenize the sentence into words.
tokens = sentence.[1]()
The split() method breaks a sentence into words by spaces, which is basic tokenization.
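A minimal sketch of the filled-in answer, assuming a sample sentence (the variable value is illustrative, not from the exercise):

```python
# Blank [1] is split, per the explanation above.
sentence = "Natural language processing is fun"  # sample input (assumption)
tokens = sentence.split()  # splits on runs of whitespace
print(tokens)  # ['Natural', 'language', 'processing', 'is', 'fun']
```

Note that split() with no arguments splits on any whitespace and discards empty strings, so multiple spaces do not produce empty tokens.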
Complete the code to convert all tokens to lowercase for uniformity.
tokens = [word.[1]() for word in tokens]
The lower() method converts all characters in a string to lowercase, helping uniform text processing.
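A short sketch with the blank filled in, using a hypothetical token list:

```python
tokens = ['Natural', 'Language', 'Processing']  # sample tokens (assumption)
tokens = [word.lower() for word in tokens]  # blank [1] is lower
print(tokens)  # ['natural', 'language', 'processing']
```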
Fix the error in the code to remove punctuation from each token.
import string
clean_tokens = [word.strip(string.[1]) for word in tokens]
string.punctuation contains all punctuation characters to strip from words.
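A runnable sketch of the corrected code, with a sample token list as an assumption:

```python
import string

tokens = ['Hello,', 'world!', "it's"]  # sample tokens (assumption)
# Blank [1] is punctuation; strip() removes only leading/trailing characters.
clean_tokens = [word.strip(string.punctuation) for word in tokens]
print(clean_tokens)  # ['Hello', 'world', "it's"]
```

Because strip() only touches the ends of a string, internal punctuation such as the apostrophe in "it's" is preserved.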
Fill both blanks to create a dictionary counting word frequencies.
word_freq = {word: tokens.[1](word) for word in tokens if word [2] ''}
The count() method counts occurrences of each word. The condition word != '' skips empty strings.
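With both blanks filled in, the comprehension can be sketched against a small sample token list (the list itself is an assumption):

```python
tokens = ['the', 'cat', 'sat', 'on', 'the', 'mat', '']  # sample tokens (assumption)
# Blank [1] is count, blank [2] is !=; duplicate keys are simply rewritten.
word_freq = {word: tokens.count(word) for word in tokens if word != ''}
print(word_freq)  # {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}
```

Calling tokens.count(word) inside the comprehension rescans the list for every token, which is fine for short inputs; collections.Counter is the idiomatic choice for larger ones.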
Fill all three blanks to filter out stopwords and create a list of important words.
stopwords = {'and', 'the', 'is', 'in'}
important_words = [word for word in tokens if word [1] stopwords and len(word) [2] [3]]
We exclude stopwords with not in, and we keep words longer than 2 characters using len(word) > 2.
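Putting all three blanks together, a runnable sketch with a sample token list (the list is an assumption; the stopword set is the one given above):

```python
stopwords = {'and', 'the', 'is', 'in'}
tokens = ['the', 'cat', 'is', 'on', 'a', 'mat', 'and', 'sleeping']  # sample (assumption)
# Blank [1] is not in, blank [2] is >, blank [3] is 2.
important_words = [word for word in tokens if word not in stopwords and len(word) > 2]
print(important_words)  # ['cat', 'mat', 'sleeping']
```

Using a set for stopwords makes each membership test O(1) on average, which matters once the token list grows.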