PyTorch · How-To · Beginner · 4 min read

How to Use Huggingface with PyTorch: Simple Guide

To use Huggingface with PyTorch, install the transformers library, load a pretrained model and tokenizer with from_pretrained, then pass PyTorch tensors to the model for training or inference. Huggingface transformers models are standard PyTorch nn.Module objects, so they work directly with PyTorch tensors, optimizers, and devices, and they provide easy APIs for NLP tasks.
📐

Syntax

Here is the basic syntax to load a Huggingface model and tokenizer with PyTorch:

  • from transformers import AutoModelForSequenceClassification, AutoTokenizer: Import model and tokenizer classes.
  • tokenizer = AutoTokenizer.from_pretrained('model-name'): Load tokenizer for text processing.
  • model = AutoModelForSequenceClassification.from_pretrained('model-name'): Load pretrained PyTorch model.
  • inputs = tokenizer(text, return_tensors='pt'): Convert text to PyTorch tensors.
  • outputs = model(**inputs): Run model forward pass.
python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')

text = "Hello, Huggingface with PyTorch!"
inputs = tokenizer(text, return_tensors='pt')
outputs = model(**inputs)
💻

Example

This example shows how to load a pretrained BERT model for sentiment classification, tokenize input text, and get model predictions using PyTorch tensors.

python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load tokenizer and model
model_name = 'distilbert-base-uncased-finetuned-sst-2-english'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input text
text = "I love using Huggingface with PyTorch!"
inputs = tokenizer(text, return_tensors='pt')

# Forward pass
outputs = model(**inputs)

# Get logits and predicted class
logits = outputs.logits
predicted_class = torch.argmax(logits, dim=-1).item()

print(f"Logits: {logits}")
print(f"Predicted class: {predicted_class}")
Output
Logits: tensor([[ 3.2157, -3.2157]], grad_fn=<AddmmBackward0>)
Predicted class: 0
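Raw logits are often less useful than probabilities. As a small extension of the example above, here is one way to convert the logits to softmax probabilities and map the predicted index back to a human-readable label via the model's id2label config (the exact label strings depend on the checkpoint):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = 'distilbert-base-uncased-finetuned-sst-2-english'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for deterministic inference

inputs = tokenizer("I love using Huggingface with PyTorch!", return_tensors='pt')
with torch.no_grad():  # no gradients needed for inference
    outputs = model(**inputs)

# Softmax turns logits into probabilities that sum to 1
probs = torch.softmax(outputs.logits, dim=-1)
predicted_id = torch.argmax(probs, dim=-1).item()
label = model.config.id2label[predicted_id]

print(f"Probabilities: {probs}")
print(f"Predicted label: {label}")
```

For this checkpoint, id2label maps indices to 'NEGATIVE' and 'POSITIVE', but other checkpoints define their own label names.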
⚠️

Common Pitfalls

  • Omitting return_tensors='pt' in the tokenizer call returns plain Python lists, which the model cannot consume; it expects PyTorch tensors.
  • Forgetting to move the model and inputs to the same device (CPU or GPU) leads to device-mismatch errors.
  • Using the wrong model class for the task (e.g., loading a masked language model for classification) produces unexpected outputs.
  • Not calling model.eval() during inference leaves dropout active, which can make results non-deterministic.
python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Wrong: tokenizer returns lists, not tensors
model_name = 'bert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "Test"
inputs = tokenizer(text)  # Missing return_tensors='pt'

# This will cause an error:
# outputs = model(**inputs)

# Correct way:
inputs = tokenizer(text, return_tensors='pt')
outputs = model(**inputs)
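The device-mismatch and eval-mode pitfalls can be handled together. Here is a minimal sketch that picks a device, moves the model and every input tensor to it, and switches to evaluation mode before the forward pass:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = 'bert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Pick the GPU when available, otherwise fall back to CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
model.eval()  # disable dropout for consistent inference results

inputs = tokenizer("Device handling example", return_tensors='pt')
# Move every input tensor to the same device as the model
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)
print(outputs.logits)
```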
📊

Quick Reference

| Step | Command | Purpose |
|------|---------|---------|
| 1 | pip install transformers torch | Install required libraries |
| 2 | from transformers import AutoModelForSequenceClassification, AutoTokenizer | Import model and tokenizer classes |
| 3 | tokenizer = AutoTokenizer.from_pretrained('model-name') | Load tokenizer for text processing |
| 4 | model = AutoModelForSequenceClassification.from_pretrained('model-name') | Load pretrained PyTorch model |
| 5 | inputs = tokenizer(text, return_tensors='pt') | Convert text to PyTorch tensors |
| 6 | outputs = model(**inputs) | Run model forward pass |
| 7 | model.eval() | Set model to evaluation mode for inference |

Key Takeaways

  • Install the transformers library to access Huggingface models compatible with PyTorch.
  • Always call the tokenizer with return_tensors='pt' to get PyTorch tensors as input.
  • Load the correct model class for your task, e.g., sequence classification for sentiment analysis.
  • Move the model and inputs to the same device (CPU or GPU) to avoid errors.
  • Set model.eval() during inference to get consistent results.
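The introduction mentions training as well as inference. Because transformers models are ordinary nn.Module objects, a standard PyTorch training step applies. Here is a minimal fine-tuning sketch with a tiny hand-made batch (real training would iterate over a DataLoader); the texts and labels are invented for illustration:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = 'bert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tiny hand-made batch for illustration only
texts = ["great movie", "terrible movie"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors='pt')

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()  # enable dropout for training

# Passing labels= makes the model compute the loss itself
outputs = model(**batch, labels=labels)
loss = outputs.loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

print(f"Training loss: {loss.item():.4f}")
```

Because the classification head is freshly initialized, the first loss value is essentially random; it should decrease over repeated steps on real data.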