In a bidirectional RNN, what does the output at each time step represent?
Think about how information flows in both directions in a bidirectional RNN.
Bidirectional RNNs process sequences forward and backward, so the output at each time step includes context from both past and future relative to that step.
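A minimal sketch of this (the sizes here are arbitrary choices for illustration): in PyTorch, the last dimension of a bidirectional RNN's output splits into a forward half and a backward half, so each time step carries both past and future context.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Illustrative sizes: input_size=4, hidden_size=6
rnn = nn.RNN(input_size=4, hidden_size=6, bidirectional=True)
x = torch.randn(5, 2, 4)  # (seq_len=5, batch=2, input_size=4)
output, h_n = rnn(x)

# The last dimension holds [forward_states | backward_states].
forward_part = output[..., :6]   # at step t: context from steps 0..t
backward_part = output[..., 6:]  # at step t: context from steps t..4
print(output.shape)  # torch.Size([5, 2, 12])
```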
Given the following PyTorch code, what is the shape of output?
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=1, bidirectional=True)
input_seq = torch.randn(5, 3, 10)  # seq_len=5, batch=3, input_size=10
output, hidden = rnn(input_seq)
print(output.shape)
Remember the output shape is (seq_len, batch, num_directions * hidden_size).
Since the RNN is bidirectional, the last output dimension is num_directions * hidden_size = 2 * 20 = 40. The output shape is therefore (5, 3, 40).
For which of the following tasks is using a bidirectional RNN most beneficial?
Consider if future context is available when making predictions.
Bidirectional RNNs use both past and future context, so they work best when the entire sequence is available, like sentence classification.
If you set hidden_size=50 in a bidirectional RNN with num_layers=1, what is the size of the hidden state tensor h_n returned by PyTorch?
Remember the first dimension of h_n is num_layers * num_directions.
With 1 layer and 2 directions, the first dimension is 2. Hidden size per direction is 50, so shape is (2, batch_size, 50).
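This can be verified directly (the input_size, seq_len, and batch values below are arbitrary; only hidden_size=50 matters for the question):

```python
import torch
import torch.nn as nn

# Bidirectional RNN with num_layers=1 and hidden_size=50
rnn = nn.RNN(input_size=10, hidden_size=50, num_layers=1, bidirectional=True)
x = torch.randn(6, 3, 10)  # seq_len=6, batch=3, input_size=10
output, h_n = rnn(x)

# First dim of h_n is num_layers * num_directions = 1 * 2 = 2
print(h_n.shape)  # torch.Size([2, 3, 50])
```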
Consider this PyTorch code snippet:
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, bidirectional=True)
input_seq = torch.randn(7, 4, 8)  # seq_len=7, batch=4, input_size=8
output, hidden = rnn(input_seq)
last_output = output[-1]
print(last_output.shape)
What is the shape of last_output and why might using it directly be problematic for sequence classification?
Think about what the last time step output means in a bidirectional RNN.
last_output has shape (4, 32): batch_size 4 and num_directions * hidden_size = 2 * 16 = 32. It concatenates the forward direction's output at the last step (which has seen the whole sequence) with the backward direction's output at that same step, which has only processed the final input. The backward direction's full-sequence summary actually lives at output[0], so using output[-1] alone discards most of the backward context and may be a poor sequence representation for classification.
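A common remedy, sketched below with the same sizes as the snippet above: take the forward direction's output at the last step and the backward direction's output at the first step, so each direction contributes a summary of the entire sequence.

```python
import torch
import torch.nn as nn

hidden_size = 16
rnn = nn.RNN(input_size=8, hidden_size=hidden_size, bidirectional=True)
x = torch.randn(7, 4, 8)  # seq_len=7, batch=4, input_size=8
output, h_n = rnn(x)

fwd_last = output[-1, :, :hidden_size]   # forward state after reading steps 0..6
bwd_first = output[0, :, hidden_size:]   # backward state after reading steps 6..0
features = torch.cat([fwd_last, bwd_first], dim=1)
print(features.shape)  # torch.Size([4, 32])
```

Each half of `features` now reflects the full sequence, making it a better input for a classification head.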