How to Use torchvision.models in PyTorch for Pretrained Models
Use torchvision.models to load pretrained deep learning models by calling functions such as models.resnet18(pretrained=True). The loaded models can be used for image classification or as a starting point for fine-tuning: you pass in input tensors and get predictions back.
Syntax
The basic syntax to load a pretrained model from torchvision.models is:
- models.model_name(pretrained=True) loads a model with pretrained weights.
- model.eval() sets the model to evaluation mode for inference.
- Input images must be transformed to tensors and normalized before being passed to the model.
```python
import torchvision.models as models

# Load a pretrained ResNet18 model
model = models.resnet18(pretrained=True)

# Set model to evaluation mode
model.eval()
```
Example
This example loads a pretrained ResNet18 model, prepares a sample image, and runs a forward pass to get predictions.
```python
import torch
from torchvision import models, transforms
from PIL import Image
import requests

# Load pretrained ResNet18
model = models.resnet18(pretrained=True)
model.eval()

# Image preprocessing pipeline
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Load an example image from the web
url = 'https://upload.wikimedia.org/wikipedia/commons/2/26/YellowLabradorLooking_new.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image
input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0)  # create batch dimension

# Run inference
with torch.no_grad():
    output = model(input_batch)

# Get predicted class index
predicted_idx = torch.argmax(output, dim=1).item()
print(f'Predicted class index: {predicted_idx}')
```
Output
Predicted class index: 208
Common Pitfalls
- Forgetting to call model.eval() before inference can cause inconsistent results due to dropout or batch normalization layers.
- Not preprocessing input images correctly (resize, crop, normalize) leads to poor predictions.
- Passing inputs without a batch dimension (use unsqueeze(0)) will cause shape errors.
- Using pretrained=True is deprecated in newer torchvision versions; use weights=models.ResNet18_Weights.DEFAULT instead.
```python
import torchvision.models as models

# Deprecated way (older versions)
# model = models.resnet18(pretrained=True)

# Correct way (torchvision >= 0.13)
weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights)
model.eval()
```
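The evaluation-mode and batch-dimension pitfalls listed above can be reproduced without downloading any weights. The following is a minimal sketch that uses a small toy network with a dropout layer as a hypothetical stand-in for a pretrained model; the names toy, x, image, and batch are illustrative only.

```python
import torch
from torch import nn

# Toy network with dropout, a hypothetical stand-in for a real pretrained model.
toy = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 2))
x = torch.randn(1, 8)

# Training mode: dropout zeroes random activations, so repeated calls usually differ.
toy.train()
train_a, train_b = toy(x), toy(x)

# Evaluation mode: dropout is disabled, so repeated calls give the same result.
toy.eval()
with torch.no_grad():
    eval_a, eval_b = toy(x), toy(x)

print("train-mode outputs identical:", torch.equal(train_a, train_b))
print("eval-mode outputs identical: ", torch.equal(eval_a, eval_b))

# A single image tensor is (C, H, W); the models expect (N, C, H, W).
image = torch.rand(3, 224, 224)
batch = image.unsqueeze(0)
print(tuple(image.shape), "->", tuple(batch.shape))
```

The same reasoning applies to batch normalization, which switches from per-batch statistics to its stored running statistics once eval() is called.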
Quick Reference
Here are some common model functions and their usage:
| Model | Load Pretrained Syntax |
|---|---|
| ResNet18 | models.resnet18(weights=models.ResNet18_Weights.DEFAULT) |
| AlexNet | models.alexnet(weights=models.AlexNet_Weights.DEFAULT) |
| VGG16 | models.vgg16(weights=models.VGG16_Weights.DEFAULT) |
| MobileNetV2 | models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT) |
| DenseNet121 | models.densenet121(weights=models.DenseNet121_Weights.DEFAULT) |
Key Takeaways
Load pretrained models with the weights parameter for compatibility with the latest torchvision versions.
Preprocess input images with resize, crop, tensor conversion, and normalization before passing to the model.
Set the model to evaluation mode with model.eval() before running inference to get consistent results.
Add a batch dimension to inputs using unsqueeze(0) to avoid shape errors.
Use torchvision.models to quickly access popular pretrained models for image tasks.