How to Use Pretrained Transformer Models in PyTorch
To use a pretrained transformer in PyTorch, import the model and tokenizer from the Hugging Face Transformers library, load the pretrained weights with from_pretrained(), and then pass your input text through the tokenizer and model to get predictions. This lets you leverage powerful language models without training from scratch.
Syntax
Using a pretrained transformer in PyTorch typically involves these steps:
- Import the tokenizer and model from transformers.
- Load pretrained weights using the from_pretrained() method.
- Tokenize input text to convert it into model-readable format.
- Pass tokens to the model to get output predictions.
This pattern works for many transformer models, such as BERT, GPT-2, and RoBERTa.
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModel.from_pretrained('bert-base-uncased')

inputs = tokenizer('Hello world!', return_tensors='pt')
outputs = model(**inputs)
```
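The same four steps carry over to other architectures. As a minimal sketch, swapping in the roberta-base checkpoint (any compatible model name from the Hugging Face Hub works) requires no other code changes:

```python
from transformers import AutoTokenizer, AutoModel

# Same pattern with a different checkpoint; only the name changes.
tokenizer = AutoTokenizer.from_pretrained('roberta-base')
model = AutoModel.from_pretrained('roberta-base')

inputs = tokenizer('Hello world!', return_tensors='pt')
outputs = model(**inputs)
```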
Example
This example shows how to load a pretrained BERT model and tokenizer, tokenize a sentence, and get the model's last hidden states as output.
```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load pretrained tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModel.from_pretrained('bert-base-uncased')

# Prepare input text
text = 'PyTorch makes using transformers easy!'
inputs = tokenizer(text, return_tensors='pt')

# Get model outputs without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# outputs.last_hidden_state shape: (batch_size, sequence_length, hidden_size)
print('Output tensor shape:', outputs.last_hidden_state.shape)
Output
```
Output tensor shape: torch.Size([1, 8, 768])
```
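The tokenizer also accepts a list of sentences. As a sketch building on the example above (the second sentence is added here only for illustration), padding=True and truncation=True produce rectangular tensors plus an attention_mask so variable-length inputs can be batched:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModel.from_pretrained('bert-base-uncased')

# Two inputs of different lengths
texts = ['PyTorch makes using transformers easy!', 'Short sentence.']

# padding=True pads to the longest sequence in the batch;
# truncation=True cuts inputs that exceed the model's maximum length.
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')

with torch.no_grad():
    outputs = model(**inputs)

# Batch dimension is now 2; padded positions are masked via attention_mask.
print(outputs.last_hidden_state.shape)  # torch.Size([2, max_seq_len, 768])
```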
Common Pitfalls
Common mistakes when using pretrained transformers include:
- Not using the matching tokenizer and model names, which causes errors or poor results.
- Forgetting to set return_tensors='pt' in the tokenizer, leading to wrong input types.
- Passing raw text directly to the model instead of tokenized inputs.
- Not using torch.no_grad() during inference, which needlessly tracks gradients and wastes memory.
```python
from transformers import AutoTokenizer, AutoModel

# Wrong: using a tokenizer and model from different pretrained names
# tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
# model = AutoModel.from_pretrained('gpt2')  # Mismatch causes issues

# Correct:
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModel.from_pretrained('bert-base-uncased')

# Wrong: forgetting return_tensors='pt'
# inputs = tokenizer('Hello')  # Returns a dict of lists, not tensors

# Correct:
inputs = tokenizer('Hello', return_tensors='pt')

# Wrong: passing raw text to the model
# outputs = model('Hello')  # Error

# Correct:
outputs = model(**inputs)
```
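The remaining pitfall from the list, skipping torch.no_grad(), is not shown above. A minimal sketch of the fix:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModel.from_pretrained('bert-base-uncased')
inputs = tokenizer('Hello', return_tensors='pt')

# Wrong: builds autograd history that inference never uses
# outputs = model(**inputs)

# Correct: disable gradient tracking during inference
model.eval()  # also switches layers like dropout to inference mode
with torch.no_grad():
    outputs = model(**inputs)
```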
Quick Reference
Summary tips for using pretrained transformers in PyTorch:
- Always use matching tokenizer and model names.
- Use return_tensors='pt' to get PyTorch tensors from the tokenizer.
- Wrap inference code in torch.no_grad() to save memory.
- Check model output attributes such as last_hidden_state or logits, depending on your task (see the sketch below).
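As a sketch of the logits case: loading a checkpoint through a task-specific head class such as AutoModelForSequenceClassification returns logits instead of raw hidden states. The sentiment checkpoint below is one publicly available example; substitute any sequence-classification checkpoint:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# One publicly available sentiment-classification checkpoint;
# any sequence-classification model name from the Hub works here.
name = 'distilbert-base-uncased-finetuned-sst-2-english'
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer('PyTorch makes using transformers easy!', return_tensors='pt')
with torch.no_grad():
    outputs = model(**inputs)

# Classification heads expose logits, not last_hidden_state
probs = torch.softmax(outputs.logits, dim=-1)
print(probs)  # one probability per class (e.g., negative/positive)
```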
Key Takeaways
- Load pretrained transformers with matching tokenizer and model names using from_pretrained().
- Tokenize input text with return_tensors='pt' before passing it to the model.
- Use torch.no_grad() during inference to reduce memory use.
- Model outputs vary by architecture; check the documentation for output details.
- Avoid passing raw text directly to the model; always tokenize first.