Practice


1. What is the main advantage of using a bidirectional RNN compared to a standard RNN?

easy

A. It processes the input sequence in both forward and backward directions to capture full context.

B. It uses fewer parameters to reduce model size.

C. It only processes sequences backward for faster training.

D. It replaces recurrent layers with convolutional layers.

Solution

Step 1: Understand standard RNN processing
A standard RNN reads the sequence in the forward direction only, so its hidden state at each time step reflects past context alone.
Step 2: Analyze bidirectional RNN behavior
A bidirectional RNN runs one pass forward and one pass backward, so each time step sees both past and future context.
Final Answer:
It processes the input sequence in both forward and backward directions to capture full context. -> Option A
Quick Check:
Bidirectional = forward + backward context [OK]

Hint: Bidirectional means reading the sequence both ways [OK]

Common Mistakes:

Thinking bidirectional reduces parameters
Assuming it only reads backward
Confusing with convolutional layers
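
The idea in question 1 can be checked empirically. The sketch below is only an illustration: the layer sizes, the seed, and the perturbation are arbitrary choices, not part of the question. It changes the last time step of a sequence and compares the output at the first time step. The forward half should stay the same, while the backward half should change because it has already read the later inputs.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability
hidden_size = 3       # illustrative size
rnn = nn.RNN(input_size=2, hidden_size=hidden_size,
             batch_first=True, bidirectional=True)

x = torch.randn(1, 4, 2)      # batch=1, seq_len=4, input_size=2
x_changed = x.clone()
x_changed[0, -1] += 1.0       # perturb only the last time step

with torch.no_grad():
    out, _ = rnn(x)
    out_changed, _ = rnn(x_changed)

# The last dimension is [forward features | backward features].
fwd_same = torch.allclose(out[0, 0, :hidden_size],
                          out_changed[0, 0, :hidden_size])
bwd_same = torch.allclose(out[0, 0, hidden_size:],
                          out_changed[0, 0, hidden_size:])

print("forward half at t=0 unchanged:", fwd_same)
print("backward half at t=0 unchanged:", bwd_same)
```

The forward direction at step 0 has seen only the first input, so a change at the end of the sequence cannot reach it. The backward direction reaches step 0 last, after reading every later step.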

2. Which of the following is the correct way to create a bidirectional GRU layer in PyTorch?

easy

A. torch.nn.GRU(input_size=10, hidden_size=20, direction='both')

B. torch.nn.GRU(input_size=10, hidden_size=20, bidirectional=True)

C. torch.nn.GRU(input_size=10, hidden_size=20, bidirectional=False)

D. torch.nn.GRU(input_size=10, hidden_size=20, two_directions=True)

Solution

Step 1: Recall PyTorch GRU parameters
bidirectional is a boolean constructor parameter (default False) that enables processing in both directions.
Step 2: Identify correct syntax
direction and two_directions are not torch.nn.GRU parameters, and bidirectional=False builds a one-directional layer. Only torch.nn.GRU(input_size=10, hidden_size=20, bidirectional=True) creates a bidirectional GRU.
Final Answer:
torch.nn.GRU(input_size=10, hidden_size=20, bidirectional=True) -> Option B
Quick Check:
bidirectional=True enables two directions [OK]

Hint: Use bidirectional=True to enable both directions [OK]

Common Mistakes:

Using invalid parameter names like 'direction' or 'two_directions'
Setting bidirectional=False by mistake
Confusing input_size and hidden_size
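
As a quick check of question 2, the following sketch builds the layer from option B and inspects it. The variable names (`gru`, `param_names`, `rejected`) are illustrative. It also tries the keyword from option A, assuming the constructor rejects unknown keyword arguments with a TypeError, which is how Python handles an unexpected keyword.

```python
import torch
from torch import nn

# Option B: the documented boolean flag.
gru = nn.GRU(input_size=10, hidden_size=20, bidirectional=True)
print(gru.bidirectional)

# A bidirectional layer carries a second weight set, suffixed "_reverse".
param_names = [name for name, _ in gru.named_parameters()]
print(param_names)

# Option A: 'direction' is not a GRU parameter, so construction fails.
try:
    nn.GRU(input_size=10, hidden_size=20, direction='both')
    rejected = False
except TypeError as err:
    rejected = True
    print("rejected:", err)
```

The `_reverse` parameters are a convenient way to confirm at runtime that a layer really is bidirectional.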

3. Given the following PyTorch code, what is the shape of the output tensor?

import torch

rnn = torch.nn.RNN(input_size=5, hidden_size=3, bidirectional=True, batch_first=True)
input = torch.randn(4, 7, 5)  # batch=4, seq_len=7, input_size=5
output, _ = rnn(input)

medium

A. [4, 7, 3]

B. [7, 4, 6]

C. [4, 7, 6]

D. [4, 3, 7]

Solution

Step 1: Understand output shape of bidirectional RNN
With batch_first=True, the output shape is (batch_size, seq_len, hidden_size * num_directions). Here bidirectional=True gives num_directions=2.
Step 2: Calculate output shape
hidden_size=3, so output last dimension = 3 * 2 = 6. Batch=4, seq_len=7, so output shape = [4, 7, 6].
Final Answer:
[4, 7, 6] -> Option C
Quick Check:
Output last dim = hidden_size * 2 [OK]

Hint: Output last dim doubles with bidirectional=True [OK]

Common Mistakes:

Forgetting to multiply hidden_size by 2
Mixing batch and sequence dimensions
Assuming output shape matches input exactly
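
The shape calculation in question 3 can be verified directly. The sketch below reuses the sizes from the question. The names `forward_out` and `backward_out` are illustrative. It also shows how the doubled last dimension splits into the two directions, and how the final hidden state relates to the output.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed
rnn = nn.RNN(input_size=5, hidden_size=3, bidirectional=True, batch_first=True)
x = torch.randn(4, 7, 5)  # batch=4, seq_len=7, input_size=5

output, h_n = rnn(x)
print(output.shape)  # torch.Size([4, 7, 6])
print(h_n.shape)     # torch.Size([2, 4, 3]); batch_first does not affect h_n

# Last dimension = [forward hidden_size | backward hidden_size].
forward_out = output[..., :3]
backward_out = output[..., 3:]

# The forward direction finishes at the last step,
# the backward direction finishes at the first step.
print(torch.allclose(h_n[0], forward_out[:, -1]))
print(torch.allclose(h_n[1], backward_out[:, 0]))
```

Note that `h_n` keeps the layout (num_layers * num_directions, batch, hidden_size) even when batch_first=True.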

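4. You wrote this code for a batch of 5 sequences, each 10 steps long with 8 features. It runs without raising an exception, but the results are wrong:

import torch

rnn = torch.nn.RNN(input_size=8, hidden_size=4, bidirectional=True)
input = torch.randn(5, 10, 8)  # intended: batch=5, seq_len=10, input_size=8
output, hidden = rnn(input)

What is the likely cause of the problem?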

medium

A. Input tensor shape should have batch_first=True or be transposed to (seq_len, batch, input_size).

B. hidden_size must be equal to input_size for bidirectional RNNs.

C. bidirectional=True is not supported for RNN layers.

D. The input tensor must be 2D, not 3D.

Solution

Step 1: Check default input shape for PyTorch RNN
By default, PyTorch RNN expects input shape (seq_len, batch, input_size) unless batch_first=True is set.
Step 2: Analyze given input shape
The input has shape (5, 10, 8), intended as (batch, seq_len, input_size). Because batch_first=True is not set, PyTorch reads it as seq_len=5 and batch=10, so the batch and sequence dimensions are swapped.
Final Answer:
Input tensor shape should have batch_first=True or be transposed to (seq_len, batch, input_size). -> Option A
Quick Check:
Default RNN input shape = (seq_len, batch, input_size) [OK]

Hint: Set batch_first=True if input shape is (batch, seq_len, input_size) [OK]

Common Mistakes:

Assuming bidirectional disables shape rules
Thinking hidden_size must match input_size
Passing 2D input instead of 3D
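
The layout issue in question 4 is easy to miss because the last dimension still matches input_size. The sketch below, with illustrative variable names, runs the same tensor through a default layer, a batch_first=True layer, and a default layer with a transposed input. The batch dimension of the final hidden state shows how each call interpreted the tensor.

```python
import torch
from torch import nn

x = torch.randn(5, 10, 8)  # intended: batch=5, seq_len=10, input_size=8

# Default layout is (seq_len, batch, input_size): 5 is read as seq_len.
rnn_default = nn.RNN(input_size=8, hidden_size=4, bidirectional=True)
out_default, h_default = rnn_default(x)
print(h_default.shape)  # torch.Size([2, 10, 4]) -> batch read as 10

# Fix 1: declare the layout with batch_first=True.
rnn_bf = nn.RNN(input_size=8, hidden_size=4, bidirectional=True,
                batch_first=True)
out_bf, h_bf = rnn_bf(x)
print(h_bf.shape)       # torch.Size([2, 5, 4]) -> batch read as 5

# Fix 2: keep the default layer and transpose to (seq_len, batch, input_size).
out_t, h_t = rnn_default(x.transpose(0, 1))
print(out_t.shape)      # torch.Size([10, 5, 8])
print(h_t.shape)        # torch.Size([2, 5, 4])
```

In this example the output shapes of the first two calls are both (5, 10, 8), so checking the hidden state shape is the more reliable test.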

5. You want to build a sentiment analysis model using a bidirectional LSTM in PyTorch. The input sequences have variable lengths. Which approach correctly handles variable-length sequences with a bidirectional LSTM?

hard

A. Manually reverse sequences and concatenate outputs without using bidirectional=True.

B. Pad sequences to max length and feed directly without packing, with bidirectional=False.

C. Use only forward LSTM and ignore sequence lengths.

D. Use pack_padded_sequence before the LSTM and pad_packed_sequence after, with batch_first=True and bidirectional=True set.

Solution

Step 1: Understand variable-length sequence handling
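Packing is not strictly required, but without it the LSTM also steps through the padding. In a bidirectional LSTM this means the backward pass starts on padding tokens rather than on each sequence's real last token.
Step 2: Apply packing with bidirectional LSTM
Call pack_padded_sequence with the true sequence lengths before the LSTM (bidirectional=True), then restore a padded tensor with pad_packed_sequence.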
Final Answer:
Use pack_padded_sequence before the LSTM and pad_packed_sequence after, with batch_first=True and bidirectional=True set. -> Option D
Quick Check:
Pack sequences for variable length + bidirectional LSTM [OK]

Hint: Pack sequences to handle variable lengths with bidirectional LSTM [OK]

Common Mistakes:

Ignoring packing and feeding padded sequences directly
Disabling bidirectional for variable lengths
Manually reversing sequences instead of using bidirectional flag
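
The approach in option D can be sketched as follows. This is a minimal illustration, not a full sentiment model: the sizes, the lengths, and the name `sentence_repr` are assumptions made for the example. It also passes enforce_sorted=False so the batch does not need to be sorted by length.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)  # arbitrary seed
lstm = nn.LSTM(input_size=6, hidden_size=4, batch_first=True,
               bidirectional=True)

# Three sequences of true lengths 5, 3 and 2, padded with zeros to length 5.
lengths = torch.tensor([5, 3, 2])
padded = torch.randn(3, 5, 6)
for i, n in enumerate(lengths.tolist()):
    padded[i, n:] = 0.0

# Pack so the LSTM only visits real time steps.
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=False)
packed_out, (h_n, c_n) = lstm(packed)

# Unpack back to a padded tensor; padded positions are filled with zeros.
output, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
print(output.shape)   # torch.Size([3, 5, 8])
print(out_lengths)    # tensor([5, 3, 2])

# One common choice for a fixed-size sentence vector: concatenate the final
# forward and final backward hidden states.
sentence_repr = torch.cat([h_n[0], h_n[1]], dim=1)
print(sentence_repr.shape)  # torch.Size([3, 8])
```

With packing, the forward state in `h_n` comes from each sequence's real last token, and the backward pass begins there too, so padding does not leak into either direction.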
