Complete the code to tokenize the input text into words.
tokens = text.[1]()
Called with no arguments, split() breaks the text on runs of whitespace (spaces, tabs, newlines) and returns a list of words. Tokenizing the text this way is the basic first step in information retrieval.
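As a minimal sketch of the tokenization step, the example below calls split() on a sample string. The sample text is made up for illustration; it mixes double spaces and a tab to show how whitespace runs are handled.

```python
# Sample text is an illustrative assumption, not part of the exercise.
text = "Information retrieval  finds\trelevant documents"

# With no arguments, split() treats any run of whitespace as one separator.
tokens = text.split()
print(tokens)

# Passing an explicit separator behaves differently: consecutive
# separators produce empty strings in the result.
spaced = "a  b".split(" ")
print(spaced)
```

For real-world text, punctuation and letter case usually need handling as well; plain whitespace splitting leaves "cat," and "cat" as different tokens.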
Complete the code to count the frequency of each word in the list.
from collections import [1]
word_counts = [1](words)
Counter is a dict subclass for counting hashable objects, which makes it a good fit for word frequencies.
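The following sketch shows Counter applied to a small word list. The sentence used to build `words` is an assumption chosen for illustration.

```python
from collections import Counter

# Illustrative input; any list of hashable items would work.
words = "the cat sat on the mat the end".split()

# Counter maps each distinct word to the number of times it occurs.
word_counts = Counter(words)

print(word_counts["the"])          # count for a word that is present
print(word_counts["dog"])          # missing keys return 0 instead of raising KeyError
print(word_counts.most_common(1))  # the single most frequent word with its count
```

Because missing keys return 0, lookups for unseen words do not need a separate membership check.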
Fix the error in the code to compute the term frequency (TF) for a word.
tf = word_counts[[1]] / sum(word_counts.values())
The key used to look up the count must be the word as a string, so it needs quotes. Dividing that count by sum(word_counts.values()) normalizes it by the total number of tokens.
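A short worked example may make the term-frequency calculation concrete. The word list and the chosen word "the" are assumptions for illustration only.

```python
from collections import Counter

# Illustrative token list (assumed, not from the exercise).
words = ["the", "cat", "sat", "on", "the", "mat"]
word_counts = Counter(words)

# Total number of tokens: the sum of all per-word counts.
total = sum(word_counts.values())

# The key is a quoted string. Writing word_counts[the] without quotes
# would raise NameError, because `the` would be read as a variable name.
tf = word_counts["the"] / total
print(tf)
```

Here "the" occurs 2 times out of 6 tokens, so its term frequency is 2/6.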
Fill both blanks to create a dictionary of words with frequency greater than 1.
freq_words = {word: count for word, count in word_counts.items() if count [1] [2]}
The condition keeps only the words whose count is greater than 1, so the blanks are > and 1.
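The filtering step can be sketched as follows, using a small assumed word list so the result is easy to check by hand.

```python
from collections import Counter

# Illustrative counts (assumed input).
word_counts = Counter(["the", "cat", "sat", "on", "the", "mat", "cat"])

# Dictionary comprehension: keep a (word, count) pair only when the
# word appears more than once.
freq_words = {word: count for word, count in word_counts.items() if count > 1}
print(freq_words)
```

Only "the" and "cat" occur twice in this list, so they are the only keys that survive the filter.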
Fill all three blanks to compute inverse document frequency (IDF) for a word.
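import math
idf = math.log([1] / (1 + [2][[3]]))
IDF is the logarithm of the total number of documents divided by (1 + the document frequency of the word). Adding 1 to the denominator avoids division by zero for a word that appears in no document. The word key must be a string.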
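The sketch below works through the IDF formula on a toy corpus. The names `documents`, `total_docs`, and `doc_freq` are hypothetical, introduced here for illustration; the exercise does not say what its blanks are called.

```python
import math

# Toy corpus (assumed for illustration).
documents = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "birds sing at dawn",
]
total_docs = len(documents)

# Document frequency: the number of documents containing each word.
# set() ensures a word is counted once per document.
doc_freq = {}
for doc in documents:
    for word in set(doc.split()):
        doc_freq[word] = doc_freq.get(word, 0) + 1

# "cat" appears in 2 of 3 documents: log(3 / (1 + 2))
idf_cat = math.log(total_docs / (1 + doc_freq["cat"]))
# "birds" appears in 1 of 3 documents: log(3 / (1 + 1))
idf_birds = math.log(total_docs / (1 + doc_freq["birds"]))

print(idf_cat, idf_birds)
```

The rarer word gets the larger score. Note that with this form of smoothing, a word present in every document yields a slightly negative IDF, since the denominator then exceeds the document count.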