How to Use torchvision.models in PyTorch for Pretrained Models

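Use torchvision.models to load pretrained deep learning models by calling builder functions such as models.resnet18(weights=models.ResNet18_Weights.DEFAULT). Older releases used models.resnet18(pretrained=True), which is now deprecated. A loaded model can be used directly for image classification, or fine-tuned on your own data, by passing it preprocessed input tensors and reading its predictions.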

Syntax

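The basic pattern for loading a pretrained model from torchvision.models has three parts:

models.model_name(weights=...) builds the model and loads pretrained weights, for example weights=models.ResNet18_Weights.DEFAULT (torchvision 0.13 and later; older releases used pretrained=True).
model.eval() sets the model to evaluation mode for inference.
Input images must be converted to tensors and normalized with the statistics the weights were trained on before they are passed to the model.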

python

import torchvision.models as models

# Load a pretrained ResNet18 model (torchvision >= 0.13; older versions used pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Set model to evaluation mode
model.eval()

Example

This example loads a pretrained ResNet18 model, prepares a sample image, and runs a forward pass to get predictions.

python

import torch
from torchvision import models, transforms
from PIL import Image
import requests

# Load pretrained ResNet18 (torchvision >= 0.13; older versions used pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# Image preprocessing pipeline
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

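# Load an example image from the web and force 3-channel RGB
url = 'https://upload.wikimedia.org/wikipedia/commons/2/26/YellowLabradorLooking_new.jpg'
response = requests.get(url, stream=True, timeout=10)
response.raise_for_status()  # fail early if the download did not succeed
image = Image.open(response.raw).convert('RGB')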

# Preprocess the image
input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0)  # create batch dimension

# Run inference
with torch.no_grad():
    output = model(input_batch)

# Get predicted class index
predicted_idx = torch.argmax(output, dim=1).item()

print(f'Predicted class index: {predicted_idx}')

Output

Predicted class index: 208

Common Pitfalls

Forgetting to call model.eval() before inference can cause inconsistent results, because dropout and batch normalization layers behave differently in training mode.
Preprocessing input images incorrectly (wrong resize, crop, or normalization values) leads to poor predictions.
Passing a single image without a batch dimension causes shape errors; add one with unsqueeze(0).
Using pretrained=True is deprecated in newer torchvision versions; use weights=models.ResNet18_Weights.DEFAULT instead.

python

import torchvision.models as models

# Deprecated way (older versions)
# model = models.resnet18(pretrained=True)

# Correct way (torchvision >= 0.13)
weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights)
model.eval()

Quick Reference

Here are some common model functions and their usage:

Model	Load Pretrained Syntax
ResNet18	models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
AlexNet	models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
VGG16	models.vgg16(weights=models.VGG16_Weights.DEFAULT)
MobileNetV2	models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
DenseNet121	models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)

Key Takeaways

Load pretrained models with the weights parameter (torchvision 0.13 and later) rather than the deprecated pretrained=True.

Preprocess input images with resize, crop, tensor conversion, and normalization before passing to the model.

Set the model to evaluation mode with model.eval() before running inference to get consistent results.

Add a batch dimension to inputs using unsqueeze(0) to avoid shape errors.

Use torchvision.models to quickly access popular pretrained models for image tasks.