
How to Use VGG Model in PyTorch: Syntax and Example

To use VGG in PyTorch, import it from torchvision.models and load a pretrained version with vgg16(pretrained=True). You can then use it for feature extraction or fine-tuning by passing input tensors through the model.
📐 Syntax

The VGG model is available in PyTorch's torchvision.models module. You can load a pretrained VGG16 model using vgg16(pretrained=True). The model expects input images as tensors of shape (batch_size, 3, 224, 224), with pixel values normalized using the ImageNet mean and standard deviation.

  • vgg16(pretrained=True): Loads the VGG16 model with weights pretrained on ImageNet. In torchvision 0.13+, pretrained=True is deprecated in favor of the equivalent vgg16(weights=VGG16_Weights.DEFAULT).
  • model.eval(): Sets the model to evaluation mode for inference.
  • model(input_tensor): Runs the input through the model to get predictions.
python
from torchvision.models import vgg16
import torch

# Load pretrained VGG16 model
model = vgg16(pretrained=True)

# Set model to evaluation mode
model.eval()

# Example input tensor with batch size 1, 3 color channels, 224x224 image
input_tensor = torch.randn(1, 3, 224, 224)

# Get model output
output = model(input_tensor)
💻 Example

This example shows how to load the pretrained VGG16 model, prepare a random input tensor, run the model to get predictions, and print the output shape. The output is a tensor of shape (1, 1000), one score per ImageNet class. (A random tensor stands in for a preprocessed image here, so the scores themselves are meaningless.)

python
from torchvision.models import vgg16
import torch

# Load pretrained VGG16 model
model = vgg16(pretrained=True)
model.eval()

# Create a random input tensor simulating a batch of 1 RGB image 224x224
input_tensor = torch.randn(1, 3, 224, 224)

# Run the model
output = model(input_tensor)

# Print output shape and first 5 scores
print('Output shape:', output.shape)
print('First 5 class scores:', output[0, :5])
Output
Output shape: torch.Size([1, 1000])
First 5 class scores: tensor([ 0.1234, -0.5678, 1.2345, -0.3456, 0.7890], grad_fn=<SliceBackward0>)
⚠️ Common Pitfalls

  • Not setting the model to eval() mode before inference can cause inconsistent results, because dropout stays active (and, in the vgg16_bn variant, batch normalization keeps updating its running statistics).
  • Input images must be normalized with the same mean and standard deviation used during training (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]).
  • Input tensor shape must be (batch_size, 3, 224, 224). Passing images without resizing or wrong channel order causes errors.
  • For fine-tuning, remember to set model.train() and adjust the final classifier layer if needed.
python
from torchvision import transforms
from PIL import Image
import torch
from torchvision.models import vgg16

# Correct preprocessing pipeline
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Load image
image = Image.new('RGB', (300, 300))  # Dummy image
input_tensor = preprocess(image).unsqueeze(0)  # Add batch dimension

model = vgg16(pretrained=True)
model.eval()
output = model(input_tensor)

print('Output shape:', output.shape)
Output
Output shape: torch.Size([1, 1000])
📊 Quick Reference

Summary tips for using VGG in PyTorch:

  • Import from torchvision.models and load pretrained weights with vgg16(pretrained=True).
  • Preprocess input images: resize, crop, convert to tensor, normalize.
  • Set model to eval() for inference or train() for fine-tuning.
  • Output is a tensor of 1000 class scores for ImageNet.

Key Takeaways

  • Load the pretrained VGG16 model with torchvision.models.vgg16(pretrained=True) (or the newer weights argument).
  • Always preprocess input images with resizing, cropping, and normalization before passing them to VGG.
  • Call model.eval() before inference to put dropout (and any batch norm layers) in inference mode.
  • The output tensor has shape (batch_size, 1000), one score per ImageNet class.
  • For fine-tuning, replace the final classifier layer and switch to model.train() mode.