
Hugging Face integration basics in PyTorch

Introduction
Hugging Face makes it easy to use powerful language models without building them from scratch.
Common scenarios:
- You want to quickly try a pre-trained language model for text classification.
- You need to generate text, such as writing a story or answering questions.
- You want to fine-tune a model on your own small dataset.
- You want to use state-of-the-art models without deep knowledge of their internals.
Syntax
PyTorch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load tokenizer and model
model_name = 'distilbert-base-uncased-finetuned-sst-2-english'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input text
inputs = tokenizer('I love learning AI!', return_tensors='pt')

# Get model output (no gradients needed for inference)
with torch.no_grad():
    outputs = model(**inputs)

# Get prediction
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
Use the Auto* classes (AutoTokenizer, AutoModelForSequenceClassification) to load pre-trained tokenizers and models easily.
Return tensors in PyTorch format with return_tensors='pt' for compatibility.
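The tokenizer returns a dict of tensors, and model(**inputs) works because ** unpacks that dict into keyword arguments. A minimal sketch of this mechanism, using hand-built illustrative token IDs (not real tokenizer output) and a dummy function in place of the model:

```python
import torch

# A stand-in for the dict a tokenizer returns with return_tensors='pt':
# 'input_ids' holds token IDs, 'attention_mask' marks real tokens vs padding.
# The ID values here are illustrative only.
inputs = {
    'input_ids': torch.tensor([[101, 1045, 2293, 102]]),
    'attention_mask': torch.tensor([[1, 1, 1, 1]]),
}

# ** unpacks the dict into keyword arguments, which is why model(**inputs) works.
def dummy_model(input_ids=None, attention_mask=None):
    # A real model would return logits; here we just echo the input shape.
    return input_ids.shape

print(dummy_model(**inputs))  # torch.Size([1, 4])
```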
Examples
Load a base BERT model and tokenizer for classification tasks.
PyTorch
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
Tokenize text with padding and truncation for batch processing.
PyTorch
inputs = tokenizer('Hello world!', padding=True, truncation=True, return_tensors='pt')
Run the model on inputs and get raw prediction scores (logits).
PyTorch
outputs = model(**inputs)
logits = outputs.logits
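Once you have logits, picking a label is an argmax over the class dimension. A small sketch with hypothetical logit values (standing in for a real model's output):

```python
import torch

# Hypothetical logits for a batch of two sentences, two classes each
# (index 0 = negative, index 1 = positive).
logits = torch.tensor([[2.0, -1.0],
                       [-0.5, 1.5]])

labels = ['negative', 'positive']

# argmax over the last (class) dimension picks the highest-scoring class.
pred_ids = logits.argmax(dim=-1)
pred_labels = [labels[i] for i in pred_ids]
print(pred_labels)  # → ['negative', 'positive']
```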
Sample Model
This program loads a pre-trained sentiment analysis model, tokenizes input text, runs the model, and prints the predicted sentiment with confidence scores.
PyTorch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = 'distilbert-base-uncased-finetuned-sst-2-english'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = 'I love learning AI!'
inputs = tokenizer(text, return_tensors='pt')

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

labels = ['negative', 'positive']
predicted_label = labels[predictions.argmax().item()]

print(f'Text: {text}')
print(f'Predicted sentiment: {predicted_label}')
print(f'Confidence scores: {predictions.numpy()}')
Important Notes
Hugging Face models come with pre-trained weights ready to use.
Always use the matching tokenizer for the model to ensure correct input formatting.
Softmax converts raw scores to probabilities that sum to 1.
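The softmax property above is easy to verify with plain torch, using hypothetical logit values:

```python
import torch
import torch.nn.functional as F

# Hypothetical raw scores (logits) for one input and two classes.
logits = torch.tensor([[-2.3, 3.1]])

# Softmax along the class dimension turns scores into probabilities.
probs = F.softmax(logits, dim=-1)

print(probs)          # each value is in (0, 1)
print(probs.sum())    # always 1 (up to floating-point rounding)
```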
Summary
Hugging Face lets you load and use powerful language models easily.
Use AutoTokenizer and AutoModel classes to handle tokenization and model loading.
Run the model on tokenized inputs and interpret outputs with softmax for predictions.