How to Use nn.RNN in PyTorch: Syntax and Example
Use nn.RNN in PyTorch by creating an RNN layer with a specified input size, hidden size, and number of layers. Pass input tensors of shape (seq_len, batch, input_size) to the RNN instance to get the output and hidden states.
Syntax
The nn.RNN class creates a simple recurrent neural network layer. Key parameters include:
- input_size: Number of expected features in the input.
- hidden_size: Number of features in the hidden state.
- num_layers: Number of stacked RNN layers.
- nonlinearity: Activation function, either 'tanh' (default) or 'relu'.
- batch_first: If True, input shape is (batch, seq_len, input_size), else (seq_len, batch, input_size).
The forward pass returns output and hidden states.
```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2,
             nonlinearity='tanh', batch_first=False)

# Input of shape (seq_len, batch, input_size) and initial hidden
# state of shape (num_layers, batch, hidden_size)
input_tensor = torch.randn(6, 3, 10)
hidden_state = torch.zeros(2, 3, 20)

output, hidden = rnn(input_tensor, hidden_state)
```
Example
This example shows how to create an nn.RNN layer, prepare input data, and run a forward pass to get outputs and hidden states.
```python
import torch
import torch.nn as nn

# Parameters
input_size = 5
hidden_size = 3
num_layers = 1
seq_len = 4
batch_size = 2

# Create RNN layer
rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size,
             num_layers=num_layers, batch_first=True)

# Random input tensor (batch, seq_len, input_size)
input_tensor = torch.randn(batch_size, seq_len, input_size)

# Initial hidden state (num_layers, batch, hidden_size)
h0 = torch.zeros(num_layers, batch_size, hidden_size)

# Forward pass
output, hn = rnn(input_tensor, h0)
print("Output shape:", output.shape)
print("Output tensor:", output)
print("Hidden state shape:", hn.shape)
print("Hidden state tensor:", hn)
```
Output
Output shape: torch.Size([2, 4, 3])
Output tensor: tensor([[[-0.0310, 0.0957, 0.0426],
[-0.0416, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417]],
[[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417]]], grad_fn=<StackBackward0>)
Hidden state shape: torch.Size([1, 2, 3])
Hidden state tensor: tensor([[[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417]]], grad_fn=<StackBackward0>)
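One useful way to read these two results: output contains the hidden state at every time step, while hn contains only the final one. For a single-layer, unidirectional RNN, the last time step of output should therefore equal hn, which makes a handy sanity check. A minimal sketch (weights are randomly initialized, so only the relationship between the tensors matters, not their values):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=5, hidden_size=3, num_layers=1, batch_first=True)
x = torch.randn(2, 4, 5)  # (batch, seq_len, input_size)

output, hn = rnn(x)

# output[:, -1, :] is the last time step for each sequence in the batch;
# for a single-layer, unidirectional RNN it equals hn[0].
print(torch.allclose(output[:, -1, :], hn[0]))  # True
```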
Common Pitfalls
- Input shape mismatch: The input must have shape (seq_len, batch, input_size) by default, or (batch, seq_len, input_size) if batch_first=True. Mixing these up causes errors.
- Hidden state shape: The initial hidden state must have shape (num_layers, batch, hidden_size). Incorrect shapes cause runtime errors.
- Forgetting to initialize hidden state: If not provided, PyTorch uses zeros by default, but explicit initialization is recommended for clarity.
- Using wrong nonlinearity: Only 'tanh' and 'relu' are supported. Passing others raises errors.
```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=3, hidden_size=2, num_layers=1)

# Wrong input shape: the last dimension (5) does not match input_size (3)
wrong_input = torch.randn(4, 3, 5)
try:
    output, hn = rnn(wrong_input)
except Exception as e:
    print("Error due to wrong input shape:", e)

# Correct input shape: (seq_len, batch, input_size)
correct_input = torch.randn(5, 4, 3)
output, hn = rnn(correct_input)
print("Output shape with correct input:", output.shape)
```
Output
Error due to wrong input shape: input.size(-1) must be equal to input_size. Expected 3, got 5
Output shape with correct input: torch.Size([5, 4, 2])
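The remaining two pitfalls can also be checked directly: omitting h0 is equivalent to passing an all-zeros tensor, and an unsupported nonlinearity raises a ValueError at construction time. A small sketch illustrating both:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=3, hidden_size=2, num_layers=1)
x = torch.randn(5, 4, 3)  # (seq_len, batch, input_size)

# Omitting h0 is the same as passing explicit zeros
out_default, hn_default = rnn(x)
out_zeros, hn_zeros = rnn(x, torch.zeros(1, 4, 2))
print(torch.allclose(out_default, out_zeros))  # True

# Only 'tanh' and 'relu' are accepted as nonlinearity
try:
    nn.RNN(input_size=3, hidden_size=2, nonlinearity='sigmoid')
except ValueError as e:
    print("Unsupported nonlinearity:", e)
```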
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| input_size | Number of features in input | Required |
| hidden_size | Number of features in hidden state | Required |
| num_layers | Number of stacked RNN layers | 1 |
| nonlinearity | Activation function: 'tanh' or 'relu' | 'tanh' |
| batch_first | If True, input shape is (batch, seq_len, input_size) | False |
| bias | If False, the layer does not use bias weights | True |
| dropout | Dropout probability on the outputs of each layer except the last | 0 |
| bidirectional | If True, becomes a bidirectional RNN | False |
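As a sketch of how the bidirectional flag from the table affects shapes: with bidirectional=True, the output's feature dimension doubles to 2 * hidden_size (forward and backward passes concatenated), and the hidden state's first dimension becomes num_layers * 2.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=5, hidden_size=3, num_layers=1,
             batch_first=True, bidirectional=True)
x = torch.randn(2, 4, 5)  # (batch, seq_len, input_size)

output, hn = rnn(x)
print(output.shape)  # torch.Size([2, 4, 6]) -> 2 * hidden_size features
print(hn.shape)      # torch.Size([2, 2, 3]) -> (num_layers * 2, batch, hidden_size)
```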
Key Takeaways
- Create nn.RNN with input_size and hidden_size matching your data features.
- Input tensor shape must match the batch_first setting: (seq_len, batch, input_size) or (batch, seq_len, input_size).
- Initialize the hidden state with shape (num_layers, batch, hidden_size), or let PyTorch default to zeros.
- Use only 'tanh' or 'relu' for the nonlinearity parameter.
- Check output and hidden state shapes to understand RNN outputs.