
nn.RNN layer in PyTorch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is the purpose of the nn.RNN layer in PyTorch?
The nn.RNN layer processes sequences of data by passing information from one time step to the next, allowing the model to learn patterns over time.
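To make "passing information from one time step to the next" concrete, the sketch below unrolls the recurrence by hand and compares it with the built-in layer. It assumes the default single-layer tanh RNN, whose update is h_t = tanh(W_ih x_t + b_ih + W_hh h_(t-1) + b_hh); the tensor sizes are arbitrary illustrative values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=4, hidden_size=3)  # single layer, tanh, sequence-first layout
x = torch.randn(6, 2, 4)                   # (seq_len, batch_size, input_size)

# Reference result from the built-in layer.
output, h_n = rnn(x)

# Manual unrolling of the same recurrence, reusing the layer's own parameters.
h = torch.zeros(2, 3)                      # the hidden state starts at zeros by default
manual_steps = []
for t in range(x.size(0)):
    h = torch.tanh(
        x[t] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
        + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
    )
    manual_steps.append(h)
manual_output = torch.stack(manual_steps)  # (seq_len, batch_size, hidden_size)

matches = torch.allclose(output, manual_output, atol=1e-5)
print(matches)
```

Each iteration feeds the previous hidden state back in, which is the only "memory" the layer has.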
beginner
What are the main inputs to an nn.RNN layer?
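With the default batch_first=False, the main inputs are the sequence data (shape: seq_len, batch_size, input_size) and an optional initial hidden state (shape: num_layers * num_directions, batch_size, hidden_size). With batch_first=True the sequence data is (batch_size, seq_len, input_size) instead; the hidden state shape does not change.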
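A minimal sketch of these two inputs, assuming the default sequence-first layout and a single unidirectional layer (so num_layers * num_directions is 1). The sizes are illustrative only.

```python
import torch
import torch.nn as nn

seq_len, batch_size, input_size, hidden_size = 5, 3, 10, 20
rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size)

x = torch.randn(seq_len, batch_size, input_size)
h0 = torch.zeros(1, batch_size, hidden_size)  # (num_layers * num_directions, batch, hidden)

# The initial hidden state is passed as the second argument of the forward call.
output, h_n = rnn(x, h0)

# Omitting h0 is equivalent to passing zeros.
output_default, _ = rnn(x)
same = torch.allclose(output, output_default)

print(output.shape)  # torch.Size([5, 3, 20])
print(h_n.shape)     # torch.Size([1, 3, 20])
```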
beginner
What does the 'hidden_size' parameter control in nn.RNN?
It controls the size of the hidden state vector, which stores information from previous time steps and affects the model's capacity to learn patterns.
intermediate
How does nn.RNN handle multiple layers and directions?
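Set 'num_layers' to stack multiple RNN layers, and 'bidirectional=True' to process the sequence both forwards and backwards. A bidirectional RNN doubles the last dimension of 'output' to 2 * hidden_size and doubles the number of entries in the returned hidden state.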
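The effect of both options on the returned shapes can be checked with a short sketch; the sizes here are arbitrary.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2, bidirectional=True)
x = torch.randn(5, 3, 10)  # (seq_len, batch_size, input_size)

output, h_n = rnn(x)

# Last dimension of output: 2 directions * hidden_size = 40.
print(output.shape)  # torch.Size([5, 3, 40])
# First dimension of h_n: num_layers * num_directions = 2 * 2 = 4.
print(h_n.shape)     # torch.Size([4, 3, 20])
```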
beginner
What are the outputs of nn.RNN layer?
It returns 'output' (the hidden states of the last layer at every time step) and 'hidden' (the final hidden state for each layer and direction).
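For a unidirectional RNN, the two return values overlap: the last time step of 'output' equals the last layer's entry in the final hidden state. A small sketch with illustrative sizes:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2)  # unidirectional
x = torch.randn(5, 3, 10)  # (seq_len, batch_size, input_size)

output, h_n = rnn(x)

# output holds only the top layer's hidden states, one per time step.
# h_n holds the final hidden state of every layer; h_n[-1] is the top layer.
last_step_matches = torch.allclose(output[-1], h_n[-1])
print(last_step_matches)
```

Neither value is a class prediction; a separate layer such as nn.Linear is usually applied on top to produce task outputs.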
What shape should the input sequence to nn.RNN have by default (batch_first=False)?
A. batch_size, seq_len, input_size
B. seq_len, batch_size, input_size
C. input_size, seq_len, batch_size
D. batch_size, input_size, seq_len
What does setting 'bidirectional=True' do in nn.RNN?
A. Processes the sequence forwards and backwards
B. Stacks multiple RNN layers
C. Changes the activation function
D. Disables the hidden state
Which of these is NOT an output of nn.RNN?
A. Output for all time steps
B. Last hidden state
C. Predicted class labels
D. None of the above
What does the 'hidden_size' parameter affect?
A. Size of the hidden state vector
B. Length of the input sequence
C. Number of layers
D. Batch size
How can you provide an initial hidden state to nn.RNN?
A. By setting a parameter during initialization
B. You cannot provide an initial hidden state
C. By modifying the input sequence
D. By passing it as the second argument to the forward method
Explain how the nn.RNN layer processes a sequence of data step-by-step.
Think about how information flows through time steps in the RNN.
Describe the difference between a unidirectional and bidirectional nn.RNN layer.
Consider how reading the sequence backwards adds information.

      Practice

      1. What does the nn.RNN layer in PyTorch primarily do?
      easy
      A. Processes sequences step by step, keeping track of past information
      B. Sorts input data in ascending order
      C. Generates random numbers for initialization
      D. Performs matrix multiplication without memory

      Solution

      1. Step 1: Understand the purpose of RNN

        The RNN layer is designed to handle sequential data by processing one step at a time and remembering previous steps.
      2. Step 2: Compare options with RNN behavior

        Only Processes sequences step by step, keeping track of past information describes this behavior correctly; others describe unrelated functions.
      3. Final Answer:

        Processes sequences step by step, keeping track of past information -> Option A
      4. Quick Check:

        RNN remembers past inputs = A [OK]
      Hint: RNNs remember past steps in sequences [OK]
      Common Mistakes:
      • Thinking RNN sorts data
      • Confusing RNN with random number generators
      • Assuming RNN does simple matrix multiplication only
      2. Which of the following is the correct way to create an RNN layer with input size 10 and hidden size 20 in PyTorch?
      easy
      A. nn.RNN(20, 10)
      B. nn.RNN(10)
      C. nn.RNN(input_size=10, hidden_size=20)
      D. nn.RNN(hidden_size=10, input_size=20)

      Solution

      1. Step 1: Recall nn.RNN constructor parameters

        The constructor requires input_size first, then hidden_size, e.g., nn.RNN(input_size=10, hidden_size=20).
      2. Step 2: Check each option

        Only nn.RNN(input_size=10, hidden_size=20) assigns the right value to each parameter. Option A reverses the two sizes, option B omits hidden_size, and option D attaches each size to the wrong keyword.
      3. Final Answer:

        nn.RNN(input_size=10, hidden_size=20) -> Option C
      4. Quick Check:

        Input size first, hidden size second = C [OK]
      Hint: Remember: input_size before hidden_size in nn.RNN [OK]
      Common Mistakes:
      • Swapping input_size and hidden_size
      • Omitting hidden_size parameter
      • Using positional args in wrong order
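A short sketch of the constructor call, showing that the keyword and positional forms are equivalent and how a swap shows up in the weight shape. The variable names are illustrative.

```python
import torch.nn as nn

rnn_kw = nn.RNN(input_size=10, hidden_size=20)  # keywords: order cannot be confused
rnn_pos = nn.RNN(10, 20)                        # positional: input_size, then hidden_size
rnn_swapped = nn.RNN(20, 10)                    # runs, but builds a different layer

# The input-to-hidden weight has shape (hidden_size, input_size).
print(rnn_kw.weight_ih_l0.shape)       # torch.Size([20, 10])
print(rnn_pos.weight_ih_l0.shape)      # torch.Size([20, 10])
print(rnn_swapped.weight_ih_l0.shape)  # torch.Size([10, 20])
```

Swapping the sizes raises no error at construction time, which is why keyword arguments are the safer habit.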
      3. Given the code below, what is the shape of output after running the RNN?
      import torch
      import torch.nn as nn
      rnn = nn.RNN(input_size=5, hidden_size=3, batch_first=True)
      input = torch.randn(4, 7, 5)  # batch=4, seq_len=7, input_size=5
      output, hn = rnn(input)
      medium
      A. (7, 4, 3)
      B. (3, 4, 7)
      C. (4, 3, 7)
      D. (4, 7, 3)

      Solution

      1. Step 1: Understand batch_first=True effect

        With batch_first=True, input shape is (batch, seq_len, input_size), so output shape is (batch, seq_len, hidden_size).
      2. Step 2: Apply shapes to given input

        Input shape is (4, 7, 5), so output shape is (4, 7, 3) because hidden_size=3.
      3. Final Answer:

        (4, 7, 3) -> Option D
      4. Quick Check:

        Output shape = (batch, seq_len, hidden_size) = D [OK]
      Hint: batch_first=True means batch is first dimension [OK]
      Common Mistakes:
      • Confusing batch and sequence length order
      • Ignoring batch_first parameter
      • Mixing hidden_size with input_size in output shape
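The answer can be checked by running the question's code. The sketch below also prints the shape of the final hidden state, which batch_first=True does not affect: it stays (num_layers * num_directions, batch, hidden_size).

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=5, hidden_size=3, batch_first=True)
x = torch.randn(4, 7, 5)  # (batch, seq_len, input_size)

output, h_n = rnn(x)

print(output.shape)  # torch.Size([4, 7, 3])  batch-first, like the input
print(h_n.shape)     # torch.Size([1, 4, 3])  not batch-first
```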
      4. What is wrong with this code snippet using nn.RNN?
      rnn = nn.RNN(input_size=8, hidden_size=4)
      input = torch.randn(3, 5, 10)  # batch=3, seq_len=5, input_size=10
      output, hn = rnn(input)
      medium
      A. RNN requires input to be 2D tensor
      B. Input size does not match the RNN's input_size parameter
      C. Batch size should be last dimension
      D. Hidden size must be equal to input size

      Solution

      1. Step 1: Check input_size parameter vs input tensor

        The RNN expects input_size=8, but input tensor's last dimension is 10, causing mismatch.
      2. Step 2: Validate tensor shape requirements

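        The comment labels the input shape (3, 5, 10) as batch=3, seq_len=5, input_size=10. With the default batch_first=False, nn.RNN actually reads it as seq_len=3, batch=5, but under either layout the last dimension is 10, which conflicts with the RNN's input_size=8.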
      3. Final Answer:

        Input size does not match the RNN's input_size parameter -> Option B
      4. Quick Check:

        Input last dim must match input_size = B [OK]
      Hint: Input last dimension must match RNN input_size [OK]
      Common Mistakes:
      • Ignoring input_size mismatch
      • Thinking batch size is last dimension
      • Assuming RNN input is 2D tensor
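A minimal sketch that reproduces the mismatch and then fixes it by making the tensor's last dimension equal to input_size. The variable names are illustrative, and the exact error text may differ between PyTorch versions.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=4)

bad_input = torch.randn(3, 5, 10)  # last dimension is 10, but the layer expects 8
try:
    rnn(bad_input)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)

good_input = torch.randn(3, 5, 8)  # last dimension now matches input_size
output, h_n = rnn(good_input)
print(output.shape)  # torch.Size([3, 5, 4])
```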
      5. You want to process a batch of sequences with varying lengths using nn.RNN. Which approach correctly handles this in PyTorch?
      hard
      A. Pad sequences to the same length and use pack_padded_sequence before the RNN
      B. Feed sequences directly without padding or packing
      C. Use a for loop to process each sequence separately without padding
      D. Set hidden_size equal to the longest sequence length

      Solution

      1. Step 1: Understand handling variable-length sequences

        PyTorch recommends padding sequences to equal length and using pack_padded_sequence to inform RNN about actual lengths.
      2. Step 2: Evaluate options for best practice

        Pad sequences to the same length and use pack_padded_sequence before the RNN correctly describes this approach. Option B fails because sequences of different lengths cannot be stacked into a single batch tensor. Option C can produce correct results but gives up batched computation, so it is inefficient. Set hidden_size equal to the longest sequence length is unrelated to sequence length handling.
      3. Final Answer:

        Pad sequences to the same length and use pack_padded_sequence before the RNN -> Option A
      4. Quick Check:

        Use padding + pack_padded_sequence for variable lengths = A [OK]
      Hint: Pad and pack sequences before RNN for variable lengths [OK]
      Common Mistakes:
      • Feeding raw variable-length sequences directly
      • Ignoring packing after padding
      • Misusing hidden_size for sequence length
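The pad-then-pack workflow can be sketched as follows, using the helpers in torch.nn.utils.rnn. The three random sequences and their lengths are made-up illustrative data; enforce_sorted=False is used so the batch does not need to be sorted by length.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

# Three sequences of different lengths, each step having 4 features.
sequences = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]
lengths = torch.tensor([5, 3, 2])

# 1. Pad to a common length: (batch, max_len, features) = (3, 5, 4).
padded = pad_sequence(sequences, batch_first=True)

# 2. Pack so the RNN skips the padded steps.
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

rnn = nn.RNN(input_size=4, hidden_size=6, batch_first=True)
packed_output, h_n = rnn(packed)

# 3. Unpack back to a padded tensor; padded positions are filled with zeros.
output, output_lengths = pad_packed_sequence(packed_output, batch_first=True)

print(output.shape)    # torch.Size([3, 5, 6])
print(output_lengths)  # tensor([5, 3, 2])
```

With packing, h_n holds each sequence's hidden state at its own last valid step rather than at the padded end.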