How to Use Pretrained Models for Image Classification in Computer Vision
To use a pretrained model for image classification, load the model with pretrained weights, preprocess your input images to match the model's requirements, and then run the model to get predictions. Popular libraries like TensorFlow and PyTorch provide high-level APIs to load models like ResNet or MobileNet and classify images quickly.
Syntax
Using a pretrained model typically involves these steps:
- Load the model: Import the model architecture with pretrained weights.
- Preprocess input: Resize, normalize, and format images as the model expects.
- Predict: Pass the processed image to the model to get class probabilities or labels.
```python
from torchvision import models, transforms
from PIL import Image
import torch

# Load pretrained model
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()  # Set to evaluation mode

# Define preprocessing
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Load image; convert to RGB so grayscale or RGBA files still yield 3 channels
img = Image.open("path_to_image.jpg").convert("RGB")
input_tensor = preprocess(img)
input_batch = input_tensor.unsqueeze(0)  # Create batch dimension

# Predict
with torch.no_grad():
    output = model(input_batch)

# Output is raw scores (logits) for each class
```
Example
This example shows how to classify an image using PyTorch's pretrained ResNet18 model. It loads an image, preprocesses it, runs the model, and prints the top predicted class.
```python
from torchvision import models, transforms
from PIL import Image
import torch
import urllib.request

# Download labels for ImageNet classes
url = "https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt"
labels_path = "imagenet_classes.txt"
urllib.request.urlretrieve(url, labels_path)
with open(labels_path) as f:
    labels = [line.strip() for line in f]

# Load pretrained model
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# Preprocessing pipeline
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Load an example image
img_url = "https://upload.wikimedia.org/wikipedia/commons/2/26/YellowLabradorLooking_new.jpg"
img_path = "dog.jpg"
urllib.request.urlretrieve(img_url, img_path)
img = Image.open(img_path).convert("RGB")  # ensure 3 channels

# Prepare input
input_tensor = preprocess(img)
input_batch = input_tensor.unsqueeze(0)

# Run prediction
with torch.no_grad():
    output = model(input_batch)

# Get probabilities
probabilities = torch.nn.functional.softmax(output[0], dim=0)

# Get top 1 prediction
top_prob, top_catid = torch.topk(probabilities, 1)
print(f"Predicted class: {labels[top_catid.item()]} with probability {top_prob.item():.4f}")
```
Output
Predicted class: Labrador retriever with probability 0.8423
Common Pitfalls
- Not preprocessing input correctly: Models expect images resized, cropped, and normalized in a specific way.
- Forgetting to set model to evaluation mode: Use model.eval() so that layers such as dropout and batch normalization switch to their inference behavior.
- Passing wrong input shape: Models expect a batch dimension, so add one if needed.
- Ignoring device placement: For GPU use, move model and data to the same device.
```python
from torchvision import models
import torch

# Wrong: Not setting eval mode
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# model.eval()  # Missing this causes wrong predictions

# Wrong: Passing input without batch dimension
input_tensor = torch.randn(3, 224, 224)  # Missing batch
# output = model(input_tensor)  # This will error

# Right way:
model.eval()
input_batch = input_tensor.unsqueeze(0)  # Add batch dimension
output = model(input_batch)
```
Quick Reference
Tips for using pretrained models:
- Always preprocess images as the model expects (resize, crop, normalize).
- Set the model to evaluation mode with model.eval().
- Add a batch dimension to inputs before prediction.
- Use softmax on outputs to get probabilities.
- Use official pretrained weights from trusted libraries.
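The softmax tip can be tried without loading a model. The sketch below applies softmax and top-k to made-up scores; the four labels and the logit values are hypothetical and exist only to illustrate the calls.

```python
import torch

# Hypothetical class names and raw scores for a single image
labels = ["cat", "dog", "car", "tree"]
logits = torch.tensor([[2.0, 0.5, 1.0, 3.0]])

# Softmax over the class dimension turns raw scores into probabilities
probabilities = torch.softmax(logits, dim=1)

# Keep the two most likely classes
top_probs, top_ids = torch.topk(probabilities, k=2, dim=1)

for prob, idx in zip(top_probs[0], top_ids[0]):
    print(f"{labels[idx.item()]}: {prob.item():.4f}")
```

Softmax does not change the ranking of the classes; it only rescales the scores so they are non-negative and sum to 1.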
Key Takeaways
Load models with pretrained weights from libraries like PyTorch or TensorFlow.
Preprocess input images correctly to match model requirements before prediction.
Always set the model to evaluation mode using model.eval() before inference.
Add a batch dimension to input tensors to avoid shape errors.
Use softmax on model outputs to interpret predictions as probabilities.
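The evaluation-mode takeaway can be checked on a small network. The sketch below uses a tiny hypothetical model, chosen only because it contains a dropout layer, and compares two forward passes on the same input in each mode.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

# Tiny stand-in network (hypothetical), not a pretrained classifier
net = nn.Sequential(
    nn.Linear(16, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 4),
)
x = torch.randn(1, 16)

with torch.no_grad():
    net.train()  # dropout active: each call samples a new random mask
    train_same = torch.equal(net(x), net(x))

    net.eval()   # dropout disabled: repeated calls are deterministic
    eval_same = torch.equal(net(x), net(x))

print(train_same, eval_same)
```

In training mode the two outputs differ because dropout zeroes a different random subset of activations on each call, while in evaluation mode the outputs match.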