What is nn.RNN layer in PyTorch?

The nn.RNN layer helps a computer remember information from a sequence, like words in a sentence, so it can understand patterns over time.

Practice

1. What does the nn.RNN layer in PyTorch primarily do?

easy

A. Processes sequences step by step, keeping track of past information

B. Sorts input data in ascending order

C. Generates random numbers for initialization

D. Performs matrix multiplication without memory

Solution

Step 1: Understand the purpose of RNN
The RNN layer is designed to handle sequential data by processing one step at a time and remembering previous steps.
Step 2: Compare options with RNN behavior
Only option A (processes sequences step by step, keeping track of past information) describes this behavior; the other options describe unrelated operations.
Final Answer:
Processes sequences step by step, keeping track of past information -> Option A
Quick Check:
RNN remembers past inputs = A [OK]

Hint: RNNs remember past steps in sequences [OK]

Common Mistakes:

Thinking RNN sorts data
Confusing RNN with random number generators
Assuming RNN does simple matrix multiplication only
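
To make "step by step, keeping track of past information" concrete, the sketch below runs nn.RNN over a whole sequence and then repeats the same computation one time step at a time. It assumes the layer's defaults (a single layer, tanh nonlinearity, zero initial hidden state); the tensor sizes are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=4, hidden_size=3, batch_first=True)
x = torch.randn(2, 5, 4)  # batch=2, seq_len=5, input_size=4 (illustrative sizes)

with torch.no_grad():
    # Let the layer process the whole sequence in one call.
    output, hn = rnn(x)

    # Repeat the computation manually, one time step at a time.
    h = torch.zeros(2, 3)  # the initial hidden state defaults to zeros
    steps = []
    for t in range(x.size(1)):
        # The new hidden state depends on the current input AND the previous state.
        h = torch.tanh(
            x[:, t] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
            + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
        )
        steps.append(h)
    manual = torch.stack(steps, dim=1)  # (batch, seq_len, hidden_size)

matches = torch.allclose(output, manual, atol=1e-5)
print(matches)
```

The hidden state h is the "memory": it is the only thing carried from one step to the next.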

2. Which of the following is the correct way to create an RNN layer with input size 10 and hidden size 20 in PyTorch?

easy

A. nn.RNN(20, 10)

B. nn.RNN(10)

C. nn.RNN(input_size=10, hidden_size=20)

D. nn.RNN(hidden_size=10, input_size=20)

Solution

Step 1: Recall nn.RNN constructor parameters
The constructor requires input_size first, then hidden_size, e.g., nn.RNN(input_size=10, hidden_size=20).
Step 2: Check each option
Only option C, nn.RNN(input_size=10, hidden_size=20), uses the correct names and sizes. Option A passes the sizes in reverse order, option B omits hidden_size, and option D assigns the two sizes to the wrong keywords.
Final Answer:
nn.RNN(input_size=10, hidden_size=20) -> Option C
Quick Check:
Input size first, hidden size second = C [OK]

Hint: Remember: input_size before hidden_size in nn.RNN [OK]

Common Mistakes:

Swapping input_size and hidden_size
Omitting hidden_size parameter
Using positional args in wrong order
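
One way to check which size went where is to inspect the layer after construction. The sketch below relies on the input_size and hidden_size attributes and on the weight_ih_l0 and weight_hh_l0 parameters of a single-layer nn.RNN; the variable names are illustrative only.

```python
import torch.nn as nn

# Keyword form and positional form build the same layer.
rnn_kw = nn.RNN(input_size=10, hidden_size=20)
rnn_pos = nn.RNN(10, 20)  # positional order: input_size, then hidden_size

# weight_ih_l0 maps input -> hidden, so its shape is (hidden_size, input_size).
print(rnn_kw.weight_ih_l0.shape)  # torch.Size([20, 10])
# weight_hh_l0 maps hidden -> hidden, so its shape is (hidden_size, hidden_size).
print(rnn_kw.weight_hh_l0.shape)  # torch.Size([20, 20])

# Swapping the arguments still constructs a layer, just not the intended one.
swapped = nn.RNN(20, 10)
print(swapped.input_size, swapped.hidden_size)  # 20 10
```

Because a swapped call raises no error at construction time, keyword arguments are the safer habit.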

3. Given the code below, what is the shape of output after running the RNN?

import torch
import torch.nn as nn
rnn = nn.RNN(input_size=5, hidden_size=3, batch_first=True)
input = torch.randn(4, 7, 5)  # batch=4, seq_len=7, input_size=5
output, hn = rnn(input)

medium

A. (7, 4, 3)

B. (3, 4, 7)

C. (4, 3, 7)

D. (4, 7, 3)

Solution

Step 1: Understand batch_first=True effect
With batch_first=True, input shape is (batch, seq_len, input_size), so output shape is (batch, seq_len, hidden_size).
Step 2: Apply shapes to given input
Input shape is (4, 7, 5), so output shape is (4, 7, 3) because hidden_size=3.
Final Answer:
(4, 7, 3) -> Option D
Quick Check:
Output shape = (batch, seq_len, hidden_size) = D [OK]

Hint: batch_first=True means batch is first dimension [OK]

Common Mistakes:

Confusing batch and sequence length order
Ignoring batch_first parameter
Mixing hidden_size with input_size in output shape
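
The shapes can be checked directly. The sketch below compares batch_first=True with the default batch_first=False on the same tensor, and also prints the final hidden state, whose shape is (num_layers, batch, hidden_size) for a unidirectional layer regardless of batch_first. The variable names are illustrative.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 7, 5)

# batch_first=True: x is read as (batch=4, seq_len=7, input_size=5).
rnn_bf = nn.RNN(input_size=5, hidden_size=3, batch_first=True)
out_bf, hn_bf = rnn_bf(x)
print(out_bf.shape)  # torch.Size([4, 7, 3]) -> (batch, seq_len, hidden_size)
print(hn_bf.shape)   # torch.Size([1, 4, 3]) -> (num_layers, batch, hidden_size)

# Default batch_first=False: the same x is read as (seq_len=4, batch=7, input_size=5).
rnn_default = nn.RNN(input_size=5, hidden_size=3)
out_def, hn_def = rnn_default(x)
print(out_def.shape)  # torch.Size([4, 7, 3]) -> (seq_len, batch, hidden_size)
print(hn_def.shape)   # torch.Size([1, 7, 3]) -> batch is now 7
```

Both outputs have the same shape here, but the dimensions mean different things, which is why the shape of hn is a useful check on how the input was interpreted.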

4. What is wrong with this code snippet using nn.RNN?

rnn = nn.RNN(input_size=8, hidden_size=4)
input = torch.randn(3, 5, 10)  # last dimension (feature size) is 10
output, hn = rnn(input)

medium

A. RNN requires input to be 2D tensor

B. Input size does not match the RNN's input_size parameter

C. Batch size should be last dimension

D. Hidden size must be equal to input size

Solution

Step 1: Check input_size parameter vs input tensor
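The RNN expects input_size=8, but the input tensor's last dimension is 10, causing a mismatch.
Step 2: Validate tensor shape requirements
The last dimension of the input tensor (10) must equal input_size (8), regardless of batch_first. Because batch_first defaults to False, nn.RNN reads (3, 5, 10) as (seq_len, batch, input_size), but the mismatch in the last dimension triggers the error either way.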
Final Answer:
Input size does not match the RNN's input_size parameter -> Option B
Quick Check:
Input last dim must match input_size = B [OK]

Hint: Input last dimension must match RNN input_size [OK]

Common Mistakes:

Ignoring input_size mismatch
Thinking batch size is last dimension
Assuming RNN input is 2D tensor
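
The mismatch can be reproduced and fixed in a few lines. This sketch assumes the error surfaces as a RuntimeError, which is how current PyTorch releases report it; the exact message text may vary between versions, so it is only printed, not relied on.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=4)

# Last dimension is 10, but the layer was built with input_size=8.
bad = torch.randn(3, 5, 10)
try:
    rnn(bad)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)

# Fix: make the last dimension match input_size.
# With the default batch_first=False this is (seq_len=3, batch=5, input_size=8).
good = torch.randn(3, 5, 8)
output, hn = rnn(good)
print(output.shape)  # torch.Size([3, 5, 4])
```

The alternative fix is to keep the data as it is and build the layer with input_size=10.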

5. You want to process a batch of sequences with varying lengths using nn.RNN. Which approach correctly handles this in PyTorch?

hard

A. Pad sequences to the same length and use pack_padded_sequence before the RNN

B. Feed sequences directly without padding or packing

C. Use a for loop to process each sequence separately without padding

D. Set hidden_size equal to the longest sequence length

Solution

Step 1: Understand handling variable-length sequences
PyTorch recommends padding sequences to equal length and using pack_padded_sequence to inform RNN about actual lengths.
Step 2: Evaluate options for best practice
Option A correctly describes this approach. Option B fails because the sequences in a batch tensor must share one length. Option C runs but gives up batched computation, so it is inefficient. Option D confuses hidden_size, which is the feature size of the hidden state, with sequence length.
Final Answer:
Pad sequences to the same length and use pack_padded_sequence before the RNN -> Option A
Quick Check:
Use padding + pack_padded_sequence for variable lengths = A [OK]

Hint: Pad and pack sequences before RNN for variable lengths [OK]

Common Mistakes:

Feeding raw variable-length sequences directly
Ignoring packing after padding
Misusing hidden_size for sequence length
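
A minimal sketch of the pad-then-pack workflow, using pad_sequence, pack_padded_sequence and pad_packed_sequence from torch.nn.utils.rnn. The three sequences and their lengths are made up for illustration; enforce_sorted=False is used so the batch does not have to be sorted by length beforehand.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Three sequences of lengths 5, 3 and 2, each step with 6 features (illustrative data).
seqs = [torch.randn(5, 6), torch.randn(3, 6), torch.randn(2, 6)]
lengths = torch.tensor([len(s) for s in seqs])

# Step 1: pad to a common length -> (batch=3, max_len=5, input_size=6).
padded = pad_sequence(seqs, batch_first=True)

# Step 2: pack, so the RNN knows the true lengths and skips the padding.
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

rnn = nn.RNN(input_size=6, hidden_size=4, batch_first=True)
packed_out, hn = rnn(packed)

# Step 3: unpack back to a padded tensor; padded positions are filled with zeros.
output, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
print(output.shape)          # torch.Size([3, 5, 4])
print(out_lengths.tolist())  # [5, 3, 2]
```

With packing, hn[0, i] holds the hidden state at the last real step of sequence i rather than the state after the padding.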
