Recurrent Neural Networks (RNNs) are special because they can remember information from earlier in a sequence. What about their architecture makes this possible?
Think about how RNNs connect one step to the next in a sequence.
RNNs have recurrent connections (loops) that carry a hidden state from one time step to the next, so information about past inputs persists as the sequence is processed.
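The loop can be sketched in plain Python. This is a toy cell with a single scalar hidden state and made-up weights (w_x, w_h are assumptions for illustration, not learned values); the point is that the hidden state keeps carrying information even after the input goes to zero.

```python
import math

# Hypothetical weights for a toy 1-dimensional RNN cell (illustration only).
w_x, w_h = 0.5, 0.8  # input-to-hidden and hidden-to-hidden weights

def rnn_step(x_t, h_prev):
    """One recurrence step: the new hidden state mixes the current input
    with the previous hidden state, then squashes through tanh."""
    return math.tanh(w_x * x_t + w_h * h_prev)

inputs = [1.0, 0.0, 0.0]  # only the first step carries a signal
h = 0.0
history = []
for x_t in inputs:
    h = rnn_step(x_t, h)
    history.append(h)

# The hidden state stays nonzero for the zero inputs:
# the loop is what "remembers" the first input.
print(history)
```

Running this shows the hidden state decaying but never resetting, which is exactly the memory the answer above describes.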
Consider this PyTorch code snippet creating an RNN and passing a batch of sequences through it. What is the shape of the output tensor?
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=5, hidden_size=3, num_layers=1, batch_first=True)
input_seq = torch.randn(4, 7, 5)  # batch=4, seq_len=7, input_size=5
output, hidden = rnn(input_seq)
print(output.shape)
Remember that batch_first=True means batch size is the first dimension.
The output shape is (batch_size, sequence_length, hidden_size) when batch_first=True; for this snippet that is torch.Size([4, 7, 3]).
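The shape falls out of the recurrence itself: the RNN emits one hidden_size-dimensional vector per time step, for every sequence in the batch. A minimal dependency-free sketch of that bookkeeping, using a dummy stand-in for the learned cell:

```python
batch, seq_len, input_size, hidden_size = 4, 7, 5, 3

def dummy_step(x_t, h_prev):
    """Stand-in for the learned RNN cell: always returns a
    hidden_size-dimensional vector (values don't matter here)."""
    return [0.0] * hidden_size

# Collect one hidden vector per time step, for each batch element.
output = []
for b in range(batch):
    h = [0.0] * hidden_size          # initial hidden state
    per_step = []
    for t in range(seq_len):
        x_t = [0.0] * input_size     # placeholder input vector
        h = dummy_step(x_t, h)
        per_step.append(h)
    output.append(per_step)

print(len(output), len(output[0]), len(output[0][0]))  # 4 7 3
```

The nesting mirrors the tensor dimensions: batch on the outside, time steps in the middle, hidden units innermost.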
You want to build a model that predicts the next word in a sentence. Which model type is best suited to handle the sequence nature of this task?
Think about which model can remember previous words in a sentence.
RNNs are designed to handle sequences by processing inputs step-by-step and keeping track of past information, making them ideal for predicting the next word.
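The "predict the next word" part is just a scoring step on top of the hidden state: project it to one score per vocabulary word and take the highest. A sketch with a tiny hypothetical vocabulary and hand-set scores (in a real model the logits would come from a learned output layer):

```python
# Hypothetical vocabulary and logits for illustration only.
vocab = ["cat", "sat", "mat", "the"]
logits = [0.1, 2.3, 0.4, -1.0]  # pretend output of hidden_state @ W_out

# Greedy next-word prediction: pick the highest-scoring word.
best_index = max(range(len(vocab)), key=lambda i: logits[i])
predicted = vocab[best_index]
print(predicted)  # sat
```

The hidden state summarizes the words seen so far, which is why the scores can depend on the whole prefix of the sentence, not just the last word.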
In an RNN, what is the effect of increasing the hidden_size parameter?
Think about what hidden_size controls inside the RNN cell.
The hidden_size controls the size of the hidden state vector, which stores information from previous steps. A larger hidden_size means more capacity to learn complex patterns, at the cost of more parameters and slower training.
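The parameter cost is concrete: a single-layer nn.RNN holds an input-to-hidden weight matrix, a hidden-to-hidden weight matrix, and two bias vectors, so the count grows roughly quadratically in hidden_size. A quick sketch of that count (matching the snippet's input_size=5):

```python
def rnn_param_count(input_size, hidden_size):
    """Parameter count for one nn.RNN layer: W_ih, W_hh, b_ih, b_hh."""
    return (input_size * hidden_size     # W_ih: input -> hidden
            + hidden_size * hidden_size  # W_hh: hidden -> hidden (the loop)
            + 2 * hidden_size)           # b_ih and b_hh biases

print(rnn_param_count(5, 3))    # 30
print(rnn_param_count(5, 300))  # 92100
```

The hidden-to-hidden term dominates for large hidden_size, which is why doubling it roughly quadruples the recurrent weights.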
You trained an RNN to predict the next character in a text sequence. After training, you want to measure how well it predicts. Which metric is most appropriate to evaluate this task?
Think about what it means to predict the next character correctly.
Accuracy measures how often the predicted character matches the true next character, which is the right metric for classification tasks like next character prediction.
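Accuracy here is simply matches divided by total positions. A minimal sketch, with a hypothetical predicted string compared against the true next characters:

```python
def accuracy(predicted, target):
    """Fraction of positions where the predicted character
    matches the true next character."""
    assert len(predicted) == len(target)
    correct = sum(p == t for p, t in zip(predicted, target))
    return correct / len(target)

# Hypothetical model output vs. ground truth: 4 of 5 characters match.
print(accuracy("hellp", "hello"))  # 0.8
```

One prediction per position, one comparison per position; the mean of those comparisons is the accuracy.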