
nn.RNN layer in PyTorch - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output shape of nn.RNN with batch_first=True
Consider the following PyTorch code using nn.RNN with batch_first=True. What is the shape of the output tensor `out`?
PyTorch
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=5, hidden_size=3, num_layers=1, batch_first=True)
input_tensor = torch.randn(4, 7, 5)  # batch=4, seq_len=7, input_size=5
out, hn = rnn(input_tensor)
print(out.shape)
A. torch.Size([4, 7, 3])
B. torch.Size([7, 4, 3])
C. torch.Size([4, 3, 7])
D. torch.Size([7, 3, 4])
💡 Hint
Remember that batch_first=True means the batch dimension is the first dimension in the input and output.
Model Choice
intermediate
Choosing RNN parameters for sequence classification
You want to build a simple RNN model to classify sequences of length 10 with 8 features each into 4 classes. Which nn.RNN configuration is correct for this task?
A. nn.RNN(input_size=8, hidden_size=4, num_layers=1, batch_first=False)
B. nn.RNN(input_size=10, hidden_size=4, num_layers=1, batch_first=False)
C. nn.RNN(input_size=4, hidden_size=10, num_layers=2, batch_first=True)
D. nn.RNN(input_size=8, hidden_size=16, num_layers=1, batch_first=True)
💡 Hint
Input size should match the number of features per time step.
Hyperparameter
advanced
Effect of increasing num_layers in nn.RNN
What is the main effect of increasing the num_layers parameter in nn.RNN from 1 to 3?
A. The input size of the RNN will automatically triple.
B. The RNN will have 3 stacked recurrent layers, allowing it to learn more complex patterns.
C. The output size of the RNN will be three times larger.
D. The RNN will process sequences three times faster.
💡 Hint
Think about what stacking layers means in neural networks.
🔧 Debug
advanced
Identifying error in nn.RNN input shape
What error will this code raise when running the RNN forward pass?
PyTorch
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=6, hidden_size=4, num_layers=1)
input_tensor = torch.randn(5, 6, 6)  # batch=5, seq_len=6, input_size=6
out, hn = rnn(input_tensor)
A. RuntimeError: Expected input of shape (seq_len, batch, input_size), but got (batch, seq_len, input_size)
B. No error, runs successfully
C. TypeError: input_tensor must be a list, not a tensor
D. ValueError: hidden_size must be equal to input_size
💡 Hint
Check the default expected input shape for nn.RNN when batch_first is False.
Metrics
expert
Interpreting training loss behavior of nn.RNN model
You train an nn.RNN model for sequence prediction. The training loss decreases steadily, but the validation loss starts increasing after some epochs. What does this indicate?
A. The learning rate is too low, causing slow convergence.
B. The model is underfitting and needs more training.
C. The model is overfitting the training data and not generalizing well.
D. The batch size is too large, causing unstable training.
💡 Hint
Think about what it means when validation loss increases while training loss decreases.

Practice

1. What does the nn.RNN layer in PyTorch primarily do?
easy
A. Processes sequences step by step, keeping track of past information
B. Sorts input data in ascending order
C. Generates random numbers for initialization
D. Performs matrix multiplication without memory

Solution

  1. Step 1: Understand the purpose of RNN

    The RNN layer is designed to handle sequential data by processing one step at a time and remembering previous steps.
  2. Step 2: Compare options with RNN behavior

    Only option A, "Processes sequences step by step, keeping track of past information", describes this behavior; the other options describe unrelated operations.
  3. Final Answer:

    Processes sequences step by step, keeping track of past information -> Option A
  4. Quick Check:

    RNN remembers past inputs = A [OK]
Hint: RNNs remember past steps in sequences [OK]
Common Mistakes:
  • Thinking RNN sorts data
  • Confusing RNN with random number generators
  • Assuming RNN does simple matrix multiplication only
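To make "step by step, keeping track of past information" concrete, here is a minimal sketch that replays the recurrence by hand and compares it with the built-in layer. It assumes the default tanh nonlinearity and a single layer; the names x, steps, manual_out, and matches are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=5, hidden_size=3, batch_first=True)
x = torch.randn(4, 7, 5)  # (batch, seq_len, input_size)

# Reference result from the built-in layer.
out, hn = rnn(x)

# Manual recurrence: h_t = tanh(x_t @ W_ih^T + b_ih + h_{t-1} @ W_hh^T + b_hh)
h = torch.zeros(4, 3)  # initial hidden state defaults to zeros
steps = []
for t in range(x.size(1)):
    h = torch.tanh(
        x[:, t, :] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
        + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
    )
    steps.append(h)  # the hidden state carries information forward

manual_out = torch.stack(steps, dim=1)  # (batch, seq_len, hidden_size)
matches = torch.allclose(out, manual_out, atol=1e-5)
print(matches)
```

Each step reuses the previous hidden state h, which is the "memory" the question refers to.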
2. Which of the following is the correct way to create an RNN layer with input size 10 and hidden size 20 in PyTorch?
easy
A. nn.RNN(20, 10)
B. nn.RNN(10)
C. nn.RNN(input_size=10, hidden_size=20)
D. nn.RNN(hidden_size=10, input_size=20)

Solution

  1. Step 1: Recall nn.RNN constructor parameters

    The constructor requires input_size first, then hidden_size, e.g., nn.RNN(input_size=10, hidden_size=20).
  2. Step 2: Check each option

    Only nn.RNN(input_size=10, hidden_size=20) matches the correct parameter order and names; the others reverse sizes, omit hidden_size, or swap parameters.
  3. Final Answer:

    nn.RNN(input_size=10, hidden_size=20) -> Option C
  4. Quick Check:

    Input size first, hidden size second = C [OK]
Hint: Remember: input_size before hidden_size in nn.RNN [OK]
Common Mistakes:
  • Swapping input_size and hidden_size
  • Omitting hidden_size parameter
  • Using positional args in wrong order
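As a quick way to see why the argument order matters, the sketch below builds the layer by keyword, by position, and with the sizes swapped (option A), then inspects the input-to-hidden weight, whose shape is (hidden_size, input_size). The variable names are illustrative.

```python
import torch.nn as nn

by_keyword = nn.RNN(input_size=10, hidden_size=20)
by_position = nn.RNN(10, 20)  # positional order: input_size, then hidden_size
swapped = nn.RNN(20, 10)      # option A: input size 20, hidden size 10

# weight_ih_l0 has shape (hidden_size, input_size)
print(tuple(by_keyword.weight_ih_l0.shape))
print(tuple(by_position.weight_ih_l0.shape))
print(tuple(swapped.weight_ih_l0.shape))
```

Keyword arguments make the intent explicit and guard against accidental swaps.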
3. Given the code below, what is the shape of output after running the RNN?
import torch
import torch.nn as nn
rnn = nn.RNN(input_size=5, hidden_size=3, batch_first=True)
input = torch.randn(4, 7, 5)  # batch=4, seq_len=7, input_size=5
output, hn = rnn(input)
medium
A. (7, 4, 3)
B. (3, 4, 7)
C. (4, 3, 7)
D. (4, 7, 3)

Solution

  1. Step 1: Understand batch_first=True effect

    With batch_first=True, input shape is (batch, seq_len, input_size), so output shape is (batch, seq_len, hidden_size).
  2. Step 2: Apply shapes to given input

    Input shape is (4, 7, 5), so output shape is (4, 7, 3) because hidden_size=3.
  3. Final Answer:

    (4, 7, 3) -> Option D
  4. Quick Check:

    Output shape = (batch, seq_len, hidden_size) = D [OK]
Hint: batch_first=True means batch is first dimension [OK]
Common Mistakes:
  • Confusing batch and sequence length order
  • Ignoring batch_first parameter
  • Mixing hidden_size with input_size in output shape
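The shape rules above can be checked side by side. This sketch runs the same data through a batch_first=True layer and a default (batch_first=False) layer. Note that batch_first only affects the input and output tensors; the final hidden state keeps the layout (num_layers, batch, hidden_size) in both cases. Variable names are illustrative.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 7, 5)  # (batch, seq_len, input_size)

# batch_first=True: input and output are (batch, seq_len, features)
rnn_bf = nn.RNN(input_size=5, hidden_size=3, batch_first=True)
out_bf, hn_bf = rnn_bf(x)

# Default batch_first=False: input and output are (seq_len, batch, features)
rnn_sf = nn.RNN(input_size=5, hidden_size=3)
out_sf, hn_sf = rnn_sf(x.transpose(0, 1).contiguous())

print(out_bf.shape, hn_bf.shape)
print(out_sf.shape, hn_sf.shape)
```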
4. What is wrong with this code snippet using nn.RNN?
import torch
import torch.nn as nn
rnn = nn.RNN(input_size=8, hidden_size=4)
input = torch.randn(3, 5, 10)  # last dimension (features per step) is 10
output, hn = rnn(input)
medium
A. RNN requires input to be 2D tensor
B. Input size does not match the RNN's input_size parameter
C. Batch size should be last dimension
D. Hidden size must be equal to input size

Solution

  1. Step 1: Check input_size parameter vs input tensor

    The RNN expects input_size=8, but input tensor's last dimension is 10, causing mismatch.
  2. Step 2: Validate tensor shape requirements

    With the default batch_first=False, shape (3, 5, 10) is read as seq_len=3, batch=5, input_size=10. In either layout the last dimension is 10, which conflicts with the RNN's input_size=8.
  3. Final Answer:

    Input size does not match the RNN's input_size parameter -> Option B
  4. Quick Check:

    Input last dim must match input_size = B [OK]
Hint: Input last dimension must match RNN input_size [OK]
Common Mistakes:
  • Ignoring input_size mismatch
  • Thinking batch size is last dimension
  • Assuming RNN input is 2D tensor
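A small sketch of the failure and the fix follows. The exact exception type and message may vary between PyTorch versions, so the example catches both RuntimeError and ValueError and does not assume specific wording; bad, good, and error_message are illustrative names.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=4)

# Last dimension is 10, but the layer was built with input_size=8.
bad = torch.randn(3, 5, 10)
try:
    rnn(bad)
    error_message = None
except (RuntimeError, ValueError) as exc:
    error_message = str(exc)
print(error_message)

# Fix: make the last dimension equal to input_size.
good = torch.randn(3, 5, 8)  # read as (seq_len=3, batch=5, input_size=8)
output, hn = rnn(good)
print(output.shape)
```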
5. You want to process a batch of sequences with varying lengths using nn.RNN. Which approach correctly handles this in PyTorch?
hard
A. Pad sequences to the same length and use pack_padded_sequence before the RNN
B. Feed sequences directly without padding or packing
C. Use a for loop to process each sequence separately without padding
D. Set hidden_size equal to the longest sequence length

Solution

  1. Step 1: Understand handling variable-length sequences

    PyTorch recommends padding sequences to equal length and using pack_padded_sequence to inform RNN about actual lengths.
  2. Step 2: Evaluate options for best practice

    Option A, padding the sequences and calling pack_padded_sequence before the RNN, describes this approach. Option B fails because sequences of different lengths cannot be stacked into one tensor, and option C works but gives up batching, so it is slow. Option D is unrelated: hidden_size sets the width of the hidden state, not the sequence length.
  3. Final Answer:

    Pad sequences to the same length and use pack_padded_sequence before the RNN -> Option A
  4. Quick Check:

    Use padding + pack_padded_sequence for variable lengths = A [OK]
Hint: Pad and pack sequences before RNN for variable lengths [OK]
Common Mistakes:
  • Feeding raw variable-length sequences directly
  • Ignoring packing after padding
  • Misusing hidden_size for sequence length
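The pad-then-pack workflow can be sketched as below, using pad_sequence, pack_padded_sequence, and pad_packed_sequence from torch.nn.utils.rnn. The three toy sequences and their lengths are made up for illustration; enforce_sorted=False is passed so the batch does not have to be pre-sorted by length.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import (
    pad_sequence,
    pack_padded_sequence,
    pad_packed_sequence,
)

torch.manual_seed(0)

# Three sequences of lengths 5, 3, and 2, each with 8 features per step.
seqs = [torch.randn(5, 8), torch.randn(3, 8), torch.randn(2, 8)]
lengths = torch.tensor([len(s) for s in seqs])

# Pad to a common length, then pack so the RNN skips the padded steps.
padded = pad_sequence(seqs, batch_first=True)  # (3, 5, 8)
packed = pack_padded_sequence(
    padded, lengths, batch_first=True, enforce_sorted=False
)

rnn = nn.RNN(input_size=8, hidden_size=4, batch_first=True)
packed_out, hn = rnn(packed)

# Unpack back to a padded tensor; steps past each true length are zeros.
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
print(out.shape, out_lengths.tolist())
```

With packing, hn holds each sequence's hidden state at its own last real step rather than at the padded end.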