PyTorch · How-To · Beginner · 4 min read

How to Use nn.RNN in PyTorch: Syntax and Example

Use nn.RNN in PyTorch by creating an RNN layer with specified input size, hidden size, and number of layers. Pass input tensors of shape (seq_len, batch, input_size) to the RNN instance to get output and hidden states.

📐 Syntax

The nn.RNN class creates a simple recurrent neural network layer. Key parameters include:

  • input_size: Number of expected features in the input.
  • hidden_size: Number of features in the hidden state.
  • num_layers: Number of stacked RNN layers.
  • nonlinearity: Activation function, either 'tanh' (default) or 'relu'.
  • batch_first: If True, input shape is (batch, seq_len, input_size), else (seq_len, batch, input_size).

The forward pass returns output and hidden states.

python
import torch.nn as nn

# input_tensor: (seq_len, batch, input_size); hidden_state: (num_layers, batch, hidden_size)
rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2, nonlinearity='tanh', batch_first=False)
output, hidden = rnn(input_tensor, hidden_state)
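To clarify what the two return values contain, here is a small runnable sketch (the seq_len and batch values are illustrative): output holds the top layer's hidden state at every time step, while hidden holds the final hidden state of every layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # for reproducibility

# Same parameters as the snippet above
rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2, nonlinearity='tanh', batch_first=False)

x = torch.randn(7, 3, 10)        # (seq_len=7, batch=3, input_size=10)
output, hidden = rnn(x)          # h0 omitted: defaults to zeros

print(output.shape)              # torch.Size([7, 3, 20]) -- every time step, top layer
print(hidden.shape)              # torch.Size([2, 3, 20]) -- final step, every layer

# For a unidirectional RNN, the last time step of `output` equals the
# top layer's final hidden state:
print(torch.allclose(output[-1], hidden[-1]))  # True
```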

💻 Example

This example shows how to create an nn.RNN layer, prepare input data, and run a forward pass to get outputs and hidden states.

python
import torch
import torch.nn as nn

# Parameters
input_size = 5
hidden_size = 3
num_layers = 1
seq_len = 4
batch_size = 2

# Create RNN layer
rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, batch_first=True)

# Random input tensor (batch, seq_len, input_size)
input_tensor = torch.randn(batch_size, seq_len, input_size)

# Initial hidden state (num_layers, batch, hidden_size)
h0 = torch.zeros(num_layers, batch_size, hidden_size)

# Forward pass
output, hn = rnn(input_tensor, h0)

print("Output shape:", output.shape)
print("Output tensor:", output)
print("Hidden state shape:", hn.shape)
print("Hidden state tensor:", hn)
Output
Output shape: torch.Size([2, 4, 3])
Output tensor: tensor([[[-0.0310,  0.0957,  0.0426],
         [-0.0416,  0.0956,  0.0417],
         [-0.0413,  0.0956,  0.0417],
         [-0.0413,  0.0956,  0.0417]],

        [[-0.0413,  0.0956,  0.0417],
         [-0.0413,  0.0956,  0.0417],
         [-0.0413,  0.0956,  0.0417],
         [-0.0413,  0.0956,  0.0417]]], grad_fn=<StackBackward0>)
Hidden state shape: torch.Size([1, 2, 3])
Hidden state tensor: tensor([[[-0.0413,  0.0956,  0.0417],
         [-0.0413,  0.0956,  0.0417]]], grad_fn=<StackBackward0>)
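The batch_first flag only changes the expected layout, not the computation. A minimal sketch of the two conventions (module and shape values chosen for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.randn(2, 4, 5)                  # (batch=2, seq_len=4, input_size=5)

# batch_first=True consumes (batch, seq_len, input_size)
rnn_bf = nn.RNN(input_size=5, hidden_size=3, batch_first=True)
out_bf, _ = rnn_bf(x)
print(out_bf.shape)                       # torch.Size([2, 4, 3])

# batch_first=False (the default) expects the transposed layout
rnn_sf = nn.RNN(input_size=5, hidden_size=3, batch_first=False)
out_sf, _ = rnn_sf(x.transpose(0, 1))     # (seq_len, batch, input_size)
print(out_sf.shape)                       # torch.Size([4, 2, 3])
```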

⚠️ Common Pitfalls

  • Input shape mismatch: The input must have shape (seq_len, batch, input_size) by default or (batch, seq_len, input_size) if batch_first=True. Mixing these causes errors.
  • Hidden state shape: Initial hidden state must have shape (num_layers, batch, hidden_size). Incorrect shapes cause runtime errors.
  • Forgetting to initialize hidden state: If not provided, PyTorch uses zeros by default, but explicit initialization is recommended for clarity.
  • Using wrong nonlinearity: Only 'tanh' and 'relu' are supported. Passing others raises errors.
python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=3, hidden_size=2, num_layers=1)

# Wrong input shape (batch, input_size, seq_len) instead of (seq_len, batch, input_size)
wrong_input = torch.randn(4, 3, 5)

try:
    output, hn = rnn(wrong_input)
except Exception as e:
    print("Error due to wrong input shape:", e)

# Correct input shape
correct_input = torch.randn(5, 4, 3)  # (seq_len, batch, input_size)
output, hn = rnn(correct_input)
print("Output shape with correct input:", output.shape)
Output
Error due to wrong input shape: input.size(-1) must be equal to input_size. Expected 3, got 5
Output shape with correct input: torch.Size([5, 4, 2])
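To illustrate the third pitfall above, a quick check (shapes follow the snippet's conventions) that an omitted hidden state behaves exactly like an explicit zero tensor:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=3, hidden_size=2, num_layers=1)
x = torch.randn(5, 4, 3)                          # (seq_len, batch, input_size)

out_default, hn_default = rnn(x)                  # h0 omitted -> zeros
h0 = torch.zeros(1, 4, 2)                         # (num_layers, batch, hidden_size)
out_explicit, hn_explicit = rnn(x, h0)

print(torch.allclose(out_default, out_explicit))  # True
print(torch.allclose(hn_default, hn_explicit))    # True
```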

📊 Quick Reference

| Parameter     | Description                                           | Default  |
| ------------- | ----------------------------------------------------- | -------- |
| input_size    | Number of features in input                           | Required |
| hidden_size   | Number of features in hidden state                    | Required |
| num_layers    | Number of stacked RNN layers                          | 1        |
| nonlinearity  | Activation function: 'tanh' or 'relu'                 | 'tanh'   |
| batch_first   | If True, input shape is (batch, seq_len, input_size)  | False    |
| bias          | If False, no bias weights are used                    | True     |
| dropout       | Dropout probability between layers                    | 0        |
| bidirectional | If True, becomes a bidirectional RNN                  | False    |
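The bidirectional flag from the table doubles the output's feature dimension and the number of hidden-state layers. A short sketch with illustrative parameter values:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=4, hidden_size=6, num_layers=2, bidirectional=True, batch_first=True)

x = torch.randn(3, 5, 4)         # (batch=3, seq_len=5, input_size=4)
output, hn = rnn(x)

# Forward and backward outputs are concatenated on the last dimension
print(output.shape)              # torch.Size([3, 5, 12]) = (batch, seq_len, 2 * hidden_size)
# One hidden state per layer per direction
print(hn.shape)                  # torch.Size([4, 3, 6])  = (num_layers * 2, batch, hidden_size)
```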

Key Takeaways

  • Create nn.RNN with input_size and hidden_size matching your data features.
  • Input tensor shape must match the batch_first setting: (seq_len, batch, input_size) or (batch, seq_len, input_size).
  • Initialize the hidden state with shape (num_layers, batch, hidden_size), or let PyTorch default to zeros.
  • Use only 'tanh' or 'relu' for the nonlinearity parameter.
  • Check output and hidden state shapes to understand what the RNN returns.