How to Use pack_padded_sequence in PyTorch for Variable Length Sequences
Use pack_padded_sequence in PyTorch to convert padded variable-length sequences into a packed format that RNNs can process efficiently. Pass the padded input tensor and the sequence lengths (sorted in descending order by default) to pack_padded_sequence, then feed the packed result to your RNN.

Syntax
The pack_padded_sequence function converts padded sequences into a packed sequence object for RNNs. It requires the padded input tensor and the lengths of each sequence in the batch.
- input: padded tensor of shape (max_seq_len, batch_size, features)
- lengths: list or tensor of sequence lengths (must be sorted descending by default)
- batch_first: if True, input shape is (batch_size, max_seq_len, features)
- enforce_sorted: if False, lengths do not need to be sorted
```python
torch.nn.utils.rnn.pack_padded_sequence(input, lengths, batch_first=False, enforce_sorted=True)
```
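Since the parameter list mentions enforce_sorted, a quick illustration may help: with enforce_sorted=False, PyTorch sorts the batch internally, so unsorted lengths are accepted. A minimal sketch (the tensor shapes here are illustrative):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical batch: 2 sequences, max length 3, 4 features each
padded = torch.randn(2, 3, 4)
lengths = torch.tensor([2, 3])  # not sorted descending

# enforce_sorted=False lets PyTorch sort (and track) the order internally
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=False)

# batch_sizes records how many sequences are still active at each time step
print(packed.batch_sizes)  # tensor([2, 2, 1])
```

Note that enforce_sorted=True (the default) is only required for ONNX exportability; otherwise enforce_sorted=False is the simpler option.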
Example
This example shows how to prepare padded sequences and lengths, use pack_padded_sequence, and feed the packed data to an RNN. It prints the RNN output shape and the unpacked output to verify correctness.
```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Sample data: 3 sequences of different lengths
sequences = [torch.tensor([1, 2, 3, 4]),
             torch.tensor([5, 6]),
             torch.tensor([7, 8, 9])]

# Pad sequences to the max length (4)
padded = nn.utils.rnn.pad_sequence(sequences, batch_first=True)  # shape: (batch, max_len)

# Lengths of each sequence
lengths = torch.tensor([4, 2, 3])

# Sort by length descending (required by default)
lengths, perm_idx = lengths.sort(descending=True)
padded = padded[perm_idx]

# Embed input (for example, embedding size 2)
embedding = nn.Embedding(10, 2)
embedded = embedding(padded)  # shape: (batch, max_len, embed_dim)

# Pack the padded sequence
packed_input = pack_padded_sequence(embedded, lengths, batch_first=True)

# Define the RNN
rnn = nn.GRU(input_size=2, hidden_size=3, batch_first=True)

# Forward pass
packed_output, hidden = rnn(packed_input)

# Unpack the output
output, _ = pad_packed_sequence(packed_output, batch_first=True)

print('Packed output data shape:', packed_output.data.shape)
print('Unpacked output shape:', output.shape)
print('Hidden state shape:', hidden.shape)
```
Output
Packed output data shape: torch.Size([9, 3])
Unpacked output shape: torch.Size([3, 4, 3])
Hidden state shape: torch.Size([1, 3, 3])
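One detail the example glosses over: sorting by length permutes the batch, so downstream code that expects the original sample order (labels, loss masks) needs that permutation undone. A small sketch of the usual argsort trick, reusing the lengths from the example:

```python
import torch

# Same lengths as the example above
lengths = torch.tensor([4, 2, 3])
sorted_lengths, perm_idx = lengths.sort(descending=True)

# argsort of a permutation gives its inverse
inv_idx = perm_idx.argsort()

# Applying the inverse to any sorted-order tensor restores the original order
restored = sorted_lengths[inv_idx]
print(restored)  # tensor([4, 2, 3])
```

The same inv_idx can index the unpacked RNN output along the batch dimension, e.g. output[inv_idx], to put it back in the original order.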
Common Pitfalls
- Not sorting lengths descending: by default, pack_padded_sequence requires lengths sorted from longest to shortest. Forgetting this raises a runtime error.
- Mismatch between batch_first and input shape: if your input tensor shape is (batch, seq_len, features), set batch_first=True; otherwise the function misinterprets the dimensions.
- Feeding padded sequences directly to the RNN: use packed sequences for variable-length batches so the RNN does not process the padded time steps.
```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Wrong: lengths not sorted descending
padded = torch.randn(2, 3, 4)  # batch_first=True
lengths = torch.tensor([2, 3])  # not sorted descending
try:
    packed = pack_padded_sequence(padded, lengths, batch_first=True)
except Exception as e:
    print('Error:', e)

# Right: sort lengths and reorder padded to match
lengths, perm_idx = lengths.sort(descending=True)
padded = padded[perm_idx]
packed = pack_padded_sequence(padded, lengths, batch_first=True)
print('Packed sequence created successfully')
Output
Error: `lengths` array must be sorted in decreasing order when `enforce_sorted` is True. You can pass `enforce_sorted=False` to pack_padded_sequence and/or pack_sequence to sidestep this requirement if you do not need ONNX exportability.
Packed sequence created successfully
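To see why feeding padded input straight to the RNN is a pitfall, note that with a packed sequence the final hidden state is taken at each sequence's last valid step, not at the padded end. A small sketch assuming a single-layer unidirectional GRU on made-up data:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)
rnn = nn.GRU(input_size=2, hidden_size=3, batch_first=True)

padded = torch.randn(3, 4, 2)      # (batch, max_len, features), values illustrative
lengths = torch.tensor([4, 3, 2])  # already sorted descending

packed = pack_padded_sequence(padded, lengths, batch_first=True)
packed_out, hidden = rnn(packed)
output, _ = pad_packed_sequence(packed_out, batch_first=True)

# For each sequence, the final hidden state equals the output at its
# last *valid* time step: the padded steps were never processed
for i, seq_len in enumerate(lengths):
    assert torch.allclose(hidden[0, i], output[i, seq_len - 1])
print('Final hidden states ignore padding')
```

Running the same padded tensor through the GRU without packing would instead carry the hidden state through the padding, giving different (and usually unwanted) final states for the shorter sequences.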
Quick Reference
Remember these tips when using pack_padded_sequence:

- Sort sequence lengths in descending order unless enforce_sorted=False.
- Match the batch_first argument to your input tensor shape.
- Use pad_packed_sequence to convert packed output back to padded form.
- Use packed sequences to speed up RNN training on variable-length data.
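As a companion to the pad_packed_sequence tip above: by default it pads only up to the longest sequence actually present in the batch, while its total_length argument forces a fixed length (useful, per the PyTorch docs, with DataParallel). A quick sketch on toy shapes:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

rnn = nn.GRU(input_size=2, hidden_size=3, batch_first=True)
padded = torch.randn(2, 3, 2)   # (batch, max_len, features), values illustrative
lengths = torch.tensor([3, 2])  # sorted descending

packed_out, _ = rnn(pack_padded_sequence(padded, lengths, batch_first=True))

# Default: padded back only to this batch's max length (3)
out_default, _ = pad_packed_sequence(packed_out, batch_first=True)
# total_length: padded to a fixed length regardless of the batch
out_fixed, _ = pad_packed_sequence(packed_out, batch_first=True, total_length=5)

print(out_default.shape)  # torch.Size([2, 3, 3])
print(out_fixed.shape)    # torch.Size([2, 5, 3])
```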
Key Takeaways
- Always sort sequence lengths descending before using pack_padded_sequence unless you set enforce_sorted=False.
- Set batch_first=True if your input tensor shape is (batch_size, seq_len, features).
- Use pack_padded_sequence to efficiently process variable-length sequences in RNNs.
- Unpack RNN outputs with pad_packed_sequence to get padded tensor outputs.
- Incorrect length sorting or a shape mismatch causes runtime errors or wrong results.