Recall & Review
beginner
What is the purpose of text preprocessing before feeding data into an RNN?
Text preprocessing cleans and converts raw text into a numerical format that an RNN can understand and learn from. It helps improve model performance and training speed.
beginner
Why do we convert words into integers (tokenization) for RNN input?
RNNs operate on numbers, not raw strings. Tokenization splits text into tokens (often words), and each unique token is then mapped to an integer, so the model can process sentences as sequences of numbers.
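As a minimal sketch of this mapping (the dictionary and function names here are illustrative, not a standard API):

```python
# Map each word in a sentence to its integer index; unseen words fall back to <unk>.
def tokenize(sentence, word_to_idx):
    return [word_to_idx.get(w, word_to_idx["<unk>"]) for w in sentence.lower().split()]

word_to_idx = {"<pad>": 0, "<unk>": 1, "the": 2, "cat": 3, "sat": 4}
print(tokenize("The cat sat", word_to_idx))  # [2, 3, 4]
print(tokenize("the dog", word_to_idx))      # [2, 1] -- "dog" is out of vocabulary
```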
beginner
What is padding in text preprocessing for RNNs?
Padding adds extra tokens (usually zeros) to make all input sequences the same length. This allows batch processing in RNNs without errors.
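A simple way to picture padding, sketched in plain Python (PyTorch also provides `torch.nn.utils.rnn.pad_sequence` for tensors):

```python
# Right-pad every sequence with pad_value so the batch is rectangular.
def pad_sequences(seqs, pad_value=0):
    max_len = max(len(s) for s in seqs)
    return [s + [pad_value] * (max_len - len(s)) for s in seqs]

batch = [[2, 3, 4], [5, 6]]
print(pad_sequences(batch))  # [[2, 3, 4], [5, 6, 0]]
```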
intermediate
How does PyTorch's `torch.nn.utils.rnn.pack_padded_sequence` help with variable-length sequences?
It lets the RNN ignore padded parts of sequences by packing only the real data, improving efficiency and preventing the model from learning from padding.
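A small demonstration of the packing itself (the tensor values are made up; by default `pack_padded_sequence` expects lengths in descending order):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Batch of 2 padded sequences (batch_first=True) with real lengths 3 and 2.
padded = torch.tensor([[[1.0], [2.0], [3.0]],
                       [[4.0], [5.0], [0.0]]])  # the trailing 0.0 row is padding
lengths = torch.tensor([3, 2])

packed = pack_padded_sequence(padded, lengths, batch_first=True)
# batch_sizes records how many real sequences remain at each timestep,
# so the RNN never visits the padded step at all:
print(packed.batch_sizes)  # tensor([2, 2, 1])
```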
beginner
What role does a vocabulary dictionary play in text preprocessing for RNNs?
It maps each unique word to a unique integer index, enabling consistent tokenization and lookup during training and inference.
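One common way to build such a dictionary is to count word frequencies and reserve low indices for special tokens; this is a sketch, not a fixed convention:

```python
from collections import Counter

def build_vocab(corpus, min_freq=1):
    # Reserve 0 for padding and 1 for unknown words, then index by frequency.
    counts = Counter(w for sentence in corpus for w in sentence.lower().split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for word, freq in counts.most_common():
        if freq >= min_freq:
            vocab[word] = len(vocab)
    return vocab

vocab = build_vocab(["the cat sat", "the dog ran"])
print(vocab["the"])  # 2 -- the most frequent word gets the first free index
```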
Why do we need to pad sequences before feeding them into an RNN?
Padding ensures all sequences have the same length so they can be processed together in batches by the RNN.
What does tokenization do in text preprocessing?
Tokenization assigns each word a unique number so the model can process text as numbers.
Which PyTorch function helps handle padded sequences efficiently in RNNs?
`pack_padded_sequence` allows the RNN to skip padded tokens during processing.
What is the main reason to build a vocabulary dictionary in text preprocessing?
The vocabulary dictionary maps each unique word to an integer index for tokenization.
Which of these is NOT a typical step in text preprocessing for RNNs?
Image resizing is unrelated to text preprocessing.
Explain the key steps involved in preparing text data for training an RNN model.
Think about how raw text becomes numbers and how sequences are made uniform.
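The full sequence of steps can be sketched end to end in plain Python (the corpus and token choices are illustrative):

```python
# Raw text -> vocabulary -> integer sequences -> padded batch.
from collections import Counter

corpus = ["the cat sat on the mat", "dogs bark"]

# 1. Tokenize (here: lowercase whitespace split) and build the vocabulary.
tokenized = [s.lower().split() for s in corpus]
counts = Counter(w for toks in tokenized for w in toks)
vocab = {"<pad>": 0, "<unk>": 1}
for word, _ in counts.most_common():
    vocab[word] = len(vocab)

# 2. Numericalize: replace each token with its integer index.
sequences = [[vocab.get(w, vocab["<unk>"]) for w in toks] for toks in tokenized]

# 3. Pad so every sequence in the batch has the same length.
max_len = max(len(s) for s in sequences)
padded = [s + [vocab["<pad>"]] * (max_len - len(s)) for s in sequences]
print(padded)  # [[2, 3, 4, 5, 2, 6], [7, 8, 0, 0, 0, 0]]
```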
Describe how PyTorch helps handle variable-length text sequences when training RNNs.
Focus on PyTorch utilities that manage padded sequences.
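A sketch of the typical pad-pack-unpack round trip (the sequence shapes and RNN sizes here are arbitrary, chosen only for illustration):

```python
import torch
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

# Three variable-length sequences of 4-dimensional feature vectors.
seqs = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]
lengths = torch.tensor([5, 3, 2])

# pad_sequence stacks them into one (batch, max_len, features) tensor.
padded = pad_sequence(seqs, batch_first=True)  # shape (3, 5, 4)

# Pack so the RNN skips the padded timesteps entirely.
packed = pack_padded_sequence(padded, lengths, batch_first=True)
rnn = torch.nn.RNN(input_size=4, hidden_size=8, batch_first=True)
packed_out, hidden = rnn(packed)

# pad_packed_sequence restores a regular padded tensor for downstream layers.
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
print(out.shape)    # torch.Size([3, 5, 8])
print(out_lengths)  # tensor([5, 3, 2])
```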