How to Use VGG Model in PyTorch: Syntax and Example
To use VGG in PyTorch, import it from torchvision.models and load a pretrained version with vgg16(pretrained=True). You can then use it for feature extraction or fine-tuning by passing input tensors through the model.
Syntax
The VGG model is available in PyTorch's torchvision.models module. You can load a pretrained VGG16 model using vgg16(pretrained=True). The model expects input images as tensors of shape (batch_size, 3, 224, 224) with pixel values normalized using the ImageNet mean and standard deviation.
- vgg16(pretrained=True): Loads the VGG16 model with pretrained weights on ImageNet.
- model.eval(): Sets the model to evaluation mode for inference.
- model(input_tensor): Runs the input through the model to get predictions.
```python
from torchvision.models import vgg16
import torch

# Load pretrained VGG16 model
model = vgg16(pretrained=True)

# Set model to evaluation mode
model.eval()

# Example input tensor with batch size 1, 3 color channels, 224x224 image
input_tensor = torch.randn(1, 3, 224, 224)

# Get model output
output = model(input_tensor)
```
Example
This example shows how to load the pretrained VGG16 model, prepare a random input tensor, run the model to get predictions, and print the output shape. The output is a tensor of shape (1, 1000), one score per ImageNet class.
```python
from torchvision.models import vgg16
import torch

# Load pretrained VGG16 model
model = vgg16(pretrained=True)
model.eval()

# Create a random input tensor simulating a batch of 1 RGB image 224x224
input_tensor = torch.randn(1, 3, 224, 224)

# Run the model
output = model(input_tensor)

# Print output shape and first 5 scores
print('Output shape:', output.shape)
print('First 5 class scores:', output[0, :5])
```
Output
Output shape: torch.Size([1, 1000])
First 5 class scores: tensor([ 0.1234, -0.5678, 1.2345, -0.3456, 0.7890], grad_fn=<SliceBackward0>)
Common Pitfalls
- Not setting the model to eval() mode before inference can cause inconsistent results due to dropout layers (and batch normalization, in the vgg16_bn variant).
- Input images must be normalized with the same mean and standard deviation used during training (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]).
- Input tensor shape must be (batch_size, 3, 224, 224). Passing images without resizing, or with the wrong channel order, causes errors.
- For fine-tuning, remember to set model.train() and adjust the final classifier layer if needed.
```python
from torchvision import transforms
from PIL import Image
import torch
from torchvision.models import vgg16

# Correct preprocessing pipeline
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Load image
image = Image.new('RGB', (300, 300))  # Dummy image
input_tensor = preprocess(image).unsqueeze(0)  # Add batch dimension

model = vgg16(pretrained=True)
model.eval()

output = model(input_tensor)
print('Output shape:', output.shape)
```
Output
Output shape: torch.Size([1, 1000])
Quick Reference
Summary tips for using VGG in PyTorch:
- Import from torchvision.models and load pretrained weights with vgg16(pretrained=True).
- Preprocess input images: resize, crop, convert to tensor, normalize.
- Set the model to eval() for inference or train() for fine-tuning.
- Output is a tensor of 1000 class scores for ImageNet.
Key Takeaways
Load VGG16 pretrained model using torchvision.models.vgg16(pretrained=True).
Always preprocess input images with resizing, cropping, and normalization before passing to VGG.
Set model.eval() before inference to disable dropout (and batch norm updates, in the vgg16_bn variant).
Output tensor shape is (batch_size, 1000) representing ImageNet class scores.
For fine-tuning, modify the classifier layer and set model.train() mode.