How to Use pad_packed_sequence in PyTorch for Variable Length Sequences
Use pad_packed_sequence in PyTorch to convert a packed sequence back into a padded tensor together with the original sequence lengths. It is commonly used after processing variable-length sequences with RNNs to restore the padded format for further operations or evaluation.
Syntax
The pad_packed_sequence function has this basic syntax:
pad_packed_sequence(sequence, batch_first=False, padding_value=0.0, total_length=None)
Explanation of parameters:
- sequence: The packed sequence object to convert back to a padded tensor.
- batch_first: If True, the output tensor shape is (batch, seq_len, features); otherwise (seq_len, batch, features).
- padding_value: Value used to fill positions beyond each sequence's length.
- total_length: Optional fixed length to pad all output sequences to.
The function returns a tuple: (padded_sequences, lengths) where padded_sequences is the padded tensor and lengths is a tensor of original sequence lengths.
```python
padded_sequences, lengths = torch.nn.utils.rnn.pad_packed_sequence(
    packed_sequence, batch_first=False, padding_value=0.0, total_length=None
)
```
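For instance, padding_value and total_length can be set explicitly; per the PyTorch documentation, total_length is useful when downstream code (e.g. nn.DataParallel) expects every batch padded to a fixed length. A minimal sketch with made-up data:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Two sequences of lengths 3 and 2, already padded to length 3
padded = torch.tensor([[1., 2., 3.], [4., 5., 0.]])
lengths = torch.tensor([3, 2])
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

# Unpack with a custom padding value and a fixed total length of 5
out, out_lengths = pad_packed_sequence(
    packed, batch_first=True, padding_value=-1.0, total_length=5
)
print(out)          # every position past a sequence's length is -1.0
print(out_lengths)  # the original lengths, tensor([3, 2])
```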
Example
This example shows how to pack padded sequences, process them with an RNN, and then use pad_packed_sequence to get back the padded output.
```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Sample data: batch of 3 sequences with different lengths
sequences = [torch.tensor([1, 2, 3, 4]), torch.tensor([5, 6]), torch.tensor([7, 8, 9])]
lengths = torch.tensor([4, 2, 3])

# Pad sequences to the max length (4)
padded = torch.nn.utils.rnn.pad_sequence(sequences, batch_first=True)

# Pack the padded sequences
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

# Define a simple RNN
rnn = torch.nn.RNN(input_size=1, hidden_size=2, batch_first=True)

# Reshape the packed data to (total_timesteps, input_size=1) and rebuild the PackedSequence
packed_data = packed.data.unsqueeze(-1).float()
packed = torch.nn.utils.rnn.PackedSequence(
    packed_data, packed.batch_sizes, packed.sorted_indices, packed.unsorted_indices
)

# Run the RNN on the packed input
output_packed, _ = rnn(packed)

# Convert the packed output back to a padded tensor
output_padded, output_lengths = pad_packed_sequence(output_packed, batch_first=True)

print("Padded output shape:", output_padded.shape)
print("Output lengths:", output_lengths)
print("Padded output tensor:", output_padded)
```
Output
Padded output shape: torch.Size([3, 4, 2])
Output lengths: tensor([4, 2, 3])
Padded output tensor: (values vary from run to run because the RNN's weights are randomly initialized; positions beyond each sequence's length are filled with the padding value 0.0)
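A common next step after unpacking is to use the returned lengths to mask the padded positions, for example before pooling over time. The tensors below are illustrative stand-ins matching the shapes in the example above (a sketch, not part of the original example):

```python
import torch

# Stand-in for the outputs of pad_packed_sequence with batch_first=True
output_padded = torch.randn(3, 4, 2)        # (batch, seq_len, hidden)
output_lengths = torch.tensor([4, 2, 3])

# Boolean mask: True at valid timesteps, False at padding
max_len = output_padded.size(1)
mask = torch.arange(max_len)[None, :] < output_lengths[:, None]   # (3, 4)

# Zero out padded positions, then mean-pool over the valid timesteps only
masked = output_padded * mask.unsqueeze(-1)
mean_pooled = masked.sum(dim=1) / output_lengths[:, None].float() # (3, 2)
print(mask)
```

Dividing by the true lengths (rather than max_len) keeps the mean from being diluted by padding.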
Common Pitfalls
- Not sorting sequences by length before packing can cause errors; use enforce_sorted=False if sequences are unsorted.
- Forgetting to set batch_first consistently between packing and unpacking leads to shape mismatches.
- Passing raw tensors instead of packed sequences to pad_packed_sequence will raise errors.
- Ignoring the returned lengths tensor can cause confusion about the actual sequence lengths after padding.
```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Wrong: passing a padded tensor directly to pad_packed_sequence
padded = torch.tensor([[1, 2, 3], [4, 5, 0]])
try:
    pad_packed_sequence(padded)
except Exception as e:
    print("Error:", e)

# Right: pack first, then unpack
lengths = torch.tensor([3, 2])
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)
padded_out, lengths_out = pad_packed_sequence(packed, batch_first=True)
print("Padded output:", padded_out)
print("Lengths:", lengths_out)
```
Output
Error: 'Tensor' object has no attribute 'batch_sizes'
Padded output: tensor([[1, 2, 3],
[4, 5, 0]])
Lengths: tensor([3, 2])
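A related pitfall when ignoring the returned lengths: indexing output[:, -1] to grab each sequence's final timestep reads padding for the shorter sequences. A sketch using the lengths to index the last valid timestep instead (the output tensor here is a made-up stand-in):

```python
import torch

# Stand-in for pad_packed_sequence output with batch_first=True: (batch, seq_len, hidden)
output = torch.arange(24, dtype=torch.float32).reshape(2, 4, 3)
lengths = torch.tensor([4, 2])

# Wrong: output[:, -1] reads padding for the length-2 sequence.
# Right: gather each sequence's last valid timestep using lengths.
idx = (lengths - 1).view(-1, 1, 1).expand(-1, 1, output.size(2))  # (2, 1, 3)
last_valid = output.gather(1, idx).squeeze(1)                     # (2, 3)
print(last_valid)
```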
Quick Reference
Tips for using pad_packed_sequence:
- Always use pack_padded_sequence before RNNs to handle variable lengths.
- Use pad_packed_sequence after RNNs to get padded outputs for further processing.
- Set batch_first=True consistently if you prefer batch as the first dimension.
- Check the returned lengths to know the true sequence lengths after padding.
- Use padding_value to specify a custom padding value if needed.
Key Takeaways
- Use pad_packed_sequence to convert packed sequences back to padded tensors after RNN processing.
- Ensure batch_first is consistent between packing and unpacking to avoid shape errors.
- Always pass a PackedSequence object to pad_packed_sequence, not a raw tensor.
- The lengths returned by pad_packed_sequence tell you the original sequence lengths.
- Use enforce_sorted=False in pack_padded_sequence if your sequences are not sorted by length.
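As a sketch of the last point: packing unsorted sequences with the default enforce_sorted=True raises an error, while enforce_sorted=False sorts internally and restores the original batch order on unpacking (the tensors here are illustrative):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Lengths 2 and 3 are NOT in decreasing order
padded = torch.tensor([[1., 2., 0.], [3., 4., 5.]])
lengths = torch.tensor([2, 3])

try:
    pack_padded_sequence(padded, lengths, batch_first=True)  # enforce_sorted=True by default
except RuntimeError as e:
    print("Error:", e)

# Works: PyTorch sorts internally and remembers the permutation
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)
out, out_lengths = pad_packed_sequence(packed, batch_first=True)
print(out_lengths)  # tensor([2, 3]) -- original batch order restored
```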