How to Use ResNet for Image Classification in Computer Vision
To use ResNet for classification in computer vision, load a pretrained ResNet model, replace its final layer to match your number of classes, and train or fine-tune it on your dataset. Frameworks such as PyTorch make it straightforward to load and modify ResNet architectures for your classification task.
Syntax
Here is the typical syntax to use a pretrained ResNet model for classification:
- torchvision.models.resnet50(weights=models.ResNet50_Weights.DEFAULT): Loads ResNet-50 with pretrained weights.
- Replace the final fully connected layer model.fc to match your number of classes.
- Use an optimizer and loss function to train or fine-tune the model.
```python
import torch
import torchvision.models as models

# Load pretrained ResNet-50
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Replace final layer for 10 classes
model.fc = torch.nn.Linear(model.fc.in_features, 10)
```
Example
This example shows how to load ResNet-18 pretrained on ImageNet, replace its final layer for 3 classes, and run a forward pass on a dummy image tensor.
```python
import torch
import torchvision.models as models

# Load pretrained ResNet-18
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Replace final layer for 3 classes
model.fc = torch.nn.Linear(model.fc.in_features, 3)

# Create dummy input (batch size 1, 3 color channels, 224x224 image)
dummy_input = torch.randn(1, 3, 224, 224)

# Forward pass
output = model(dummy_input)

print('Output shape:', output.shape)
print('Output values:', output)
```
Output
Output shape: torch.Size([1, 3])
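Output values: tensor([[ 0.1234, -0.5678, 0.9101]], grad_fn=<AddmmBackward0>)
The exact values differ on every run because the input is random and the new final layer is randomly initialized.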
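The values printed above are raw scores (logits), not probabilities. One common way to turn them into class probabilities and a predicted class is softmax followed by argmax. The sketch below reuses the illustrative values from the output as a stand-in tensor, so it does not need the model.

```python
import torch

# Stand-in logits with the same shape as the example output: (batch=1, classes=3)
logits = torch.tensor([[0.1234, -0.5678, 0.9101]])

# Softmax over the class dimension converts logits into probabilities
probs = torch.softmax(logits, dim=1)

# argmax returns the index of the most likely class for each image
predicted_class = probs.argmax(dim=1)

print("Probabilities:", probs)
print("Predicted class index:", predicted_class.item())  # 2 for these logits
```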
Common Pitfalls
- Not replacing the final layer leaves the model with 1000 ImageNet outputs, which does not match your dataset's number of classes and leads to shape mismatch errors or meaningless predictions.
- Forgetting to set the model to train() mode during training or eval() mode during evaluation affects batch normalization and dropout layers.
- Not normalizing input images with the same mean and std used during ResNet training leads to poor performance.
- Using too high a learning rate can cause training to diverge when fine-tuning.
```python
import torch
import torchvision.models as models

# Wrong: Using pretrained ResNet without changing final layer for 5 classes
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# model.fc still outputs 1000 classes (ImageNet)

# Right: Replace final layer
model.fc = torch.nn.Linear(model.fc.in_features, 5)

# Set model to train mode
model.train()
```
Quick Reference
| Step | Description |
|---|---|
| Load pretrained ResNet | Use torchvision.models.resnetXX(weights=models.ResNetXX_Weights.DEFAULT) |
| Modify final layer | Replace model.fc with Linear(in_features, num_classes) |
| Prepare data | Normalize images with ImageNet mean and std |
| Train or fine-tune | Use optimizer and loss function, set model.train() |
| Evaluate | Call model.eval() and disable gradients with torch.no_grad() |
Key Takeaways
Always replace ResNet's final layer to match your number of classes.
Normalize input images using ImageNet mean and standard deviation.
Set model.train() during training and model.eval() during evaluation.
Use pretrained weights to speed up training and improve accuracy.
Keep the learning rate low when fine-tuning so that training does not diverge.