How to Use nn.RNN in PyTorch: Syntax and Example
Use nn.RNN in PyTorch by creating an RNN layer with a specified input size, hidden size, and number of layers. Pass input tensors of shape (seq_len, batch, input_size) to the RNN instance to get the output and hidden states.
Syntax
The nn.RNN class creates a simple recurrent neural network layer. Key parameters include:
- input_size: Number of expected features in the input.
- hidden_size: Number of features in the hidden state.
- num_layers: Number of stacked RNN layers.
- nonlinearity: Activation function, either 'tanh' (default) or 'relu'.
- batch_first: If True, input shape is (batch, seq_len, input_size), else (seq_len, batch, input_size).
The forward pass returns output and hidden states.
```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2,
             nonlinearity='tanh', batch_first=False)

# Input of shape (seq_len, batch, input_size) and initial hidden
# state of shape (num_layers, batch, hidden_size)
input_tensor = torch.randn(6, 3, 10)
hidden_state = torch.zeros(2, 3, 20)

output, hidden = rnn(input_tensor, hidden_state)
```
Example
This example shows how to create an nn.RNN layer, prepare input data, and run a forward pass to get outputs and hidden states.
```python
import torch
import torch.nn as nn

# Parameters
input_size = 5
hidden_size = 3
num_layers = 1
seq_len = 4
batch_size = 2

# Create RNN layer
rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size,
             num_layers=num_layers, batch_first=True)

# Random input tensor (batch, seq_len, input_size)
input_tensor = torch.randn(batch_size, seq_len, input_size)

# Initial hidden state (num_layers, batch, hidden_size)
h0 = torch.zeros(num_layers, batch_size, hidden_size)

# Forward pass
output, hn = rnn(input_tensor, h0)
print("Output shape:", output.shape)
print("Output tensor:", output)
print("Hidden state shape:", hn.shape)
print("Hidden state tensor:", hn)
```
Output
Output shape: torch.Size([2, 4, 3])
Output tensor: tensor([[[-0.0310, 0.0957, 0.0426],
[-0.0416, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417]],
[[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417]]], grad_fn=<StackBackward0>)
Hidden state shape: torch.Size([1, 2, 3])
Hidden state tensor: tensor([[[-0.0413, 0.0956, 0.0417],
[-0.0413, 0.0956, 0.0417]]], grad_fn=<StackBackward0>)
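One useful way to read these two results: output contains the hidden state at every time step, while hn contains only the final one. For a single-layer, unidirectional RNN, the last time step of output should therefore equal hn, which makes a handy sanity check. A minimal sketch (weights are randomly initialized, so only the relationship between the tensors matters, not their values):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=5, hidden_size=3, num_layers=1, batch_first=True)
x = torch.randn(2, 4, 5)  # (batch, seq_len, input_size)

output, hn = rnn(x)

# output[:, -1, :] is the last time step for each sequence in the batch;
# for a single-layer, unidirectional RNN it equals hn[0].
print(torch.allclose(output[:, -1, :], hn[0]))  # True
```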
Common Pitfalls
- Input shape mismatch: The input must have shape (seq_len, batch, input_size) by default, or (batch, seq_len, input_size) if batch_first=True. Mixing these up causes errors.
- Hidden state shape: The initial hidden state must have shape (num_layers, batch, hidden_size). Incorrect shapes cause runtime errors.
- Forgetting to initialize hidden state: If not provided, PyTorch uses zeros by default, but explicit initialization is recommended for clarity.
- Using wrong nonlinearity: Only 'tanh' and 'relu' are supported. Passing others raises errors.
```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=3, hidden_size=2, num_layers=1)

# Wrong input shape: the last dimension (5) does not match input_size (3)
wrong_input = torch.randn(4, 3, 5)
try:
    output, hn = rnn(wrong_input)
except Exception as e:
    print("Error due to wrong input shape:", e)

# Correct input shape: (seq_len, batch, input_size)
correct_input = torch.randn(5, 4, 3)
output, hn = rnn(correct_input)
print("Output shape with correct input:", output.shape)
```
Output
Error due to wrong input shape: input.size(-1) must be equal to input_size. Expected 3, got 5
Output shape with correct input: torch.Size([5, 4, 2])
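The remaining two pitfalls can also be checked directly: omitting h0 is equivalent to passing an all-zeros tensor, and an unsupported nonlinearity raises a ValueError at construction time. A small sketch illustrating both:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=3, hidden_size=2, num_layers=1)
x = torch.randn(5, 4, 3)  # (seq_len, batch, input_size)

# Omitting h0 is the same as passing explicit zeros
out_default, hn_default = rnn(x)
out_zeros, hn_zeros = rnn(x, torch.zeros(1, 4, 2))
print(torch.allclose(out_default, out_zeros))  # True

# Only 'tanh' and 'relu' are accepted as nonlinearity
try:
    nn.RNN(input_size=3, hidden_size=2, nonlinearity='sigmoid')
except ValueError as e:
    print("Unsupported nonlinearity:", e)
```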
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| input_size | Number of features in input | Required |
| hidden_size | Number of features in hidden state | Required |
| num_layers | Number of stacked RNN layers | 1 |
| nonlinearity | Activation function: 'tanh' or 'relu' | 'tanh' |
| batch_first | If True, input shape is (batch, seq_len, input_size) | False |
| bias | If False, the layer does not use bias weights | True |
| dropout | Dropout probability on the outputs of each layer except the last | 0 |
| bidirectional | If True, becomes a bidirectional RNN | False |
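As a sketch of how the bidirectional flag from the table affects shapes: with bidirectional=True, the output's feature dimension doubles to 2 * hidden_size (forward and backward passes concatenated), and the hidden state's first dimension becomes num_layers * 2.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=5, hidden_size=3, num_layers=1,
             batch_first=True, bidirectional=True)
x = torch.randn(2, 4, 5)  # (batch, seq_len, input_size)

output, hn = rnn(x)
print(output.shape)  # torch.Size([2, 4, 6]) -> 2 * hidden_size features
print(hn.shape)      # torch.Size([2, 2, 3]) -> (num_layers * 2, batch, hidden_size)
```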
Key Takeaways
- Create nn.RNN with input_size and hidden_size matching your data features.
- Input tensor shape must match the batch_first setting: (seq_len, batch, input_size) or (batch, seq_len, input_size).
- Initialize the hidden state with shape (num_layers, batch, hidden_size), or let PyTorch default to zeros.
- Use only 'tanh' or 'relu' for the nonlinearity parameter.
- Check output and hidden state shapes to understand RNN outputs.