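Feature extraction reuses the representations a pre-trained model has already learned, so only a small new part of the network needs training. This cuts training time and often improves results, because the model starts from useful features instead of from scratch.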
Feature extraction strategy in PyTorch
import torch
import torchvision.models as models

# Example value; set this to the number of classes in your task
num_classes = 10

# Load a pre-trained model
# (torchvision 0.13+ prefers weights=models.ResNet18_Weights.DEFAULT over pretrained=True)
model = models.resnet18(pretrained=True)

# Freeze all layers to prevent training
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer to match your task
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

# Now only the final layer will be trained
Freezing layers means their weights won't change during training.
Replacing the final layer adapts the model to your specific problem.
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = torch.nn.Linear(model.fc.in_features, 10)
model = models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False
model.classifier[6] = torch.nn.Linear(4096, 5)
This code loads a pre-trained ResNet18, freezes all layers, replaces the last layer with a new one for 3 classes, and runs dummy data through it. It prints the output shape, which should be torch.Size([2, 3]), and the number of trainable parameter tensors, which should be 2: the weight and the bias of the new layer.
import torch
import torchvision.models as models
import torch.nn as nn

# Number of classes for new task
num_classes = 3

# Load pre-trained ResNet18
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace final fully connected layer
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Create dummy input (batch size 2, 3 color channels, 224x224 image)
dummy_input = torch.randn(2, 3, 224, 224)

# Get output predictions
output = model(dummy_input)

# Print output shape and count the parameter tensors that still require gradients
print(f"Output shape: {output.shape}")
trainable_params = [p for p in model.parameters() if p.requires_grad]
print(f"Number of trainable parameters: {len(trainable_params)}")
Freezing layers helps keep learned features and reduces training time.
Only the replaced final layer's parameters require gradients and will update during training.
Use dummy inputs with correct shape to test model output before training.
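The notes above can be checked with a single optimizer step. The following is a minimal sketch, not the lesson's own code: it assumes a tiny stand-in network (the names `backbone` and `head` are hypothetical) in place of a real pre-trained model, so it runs without downloading weights. It snapshots the weights, takes one step, and compares.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a pre-trained backbone plus a new task head.
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
head = nn.Linear(16, 3)
model = nn.Sequential(backbone, head)

# Freeze the backbone only; the head keeps requires_grad=True.
for param in backbone.parameters():
    param.requires_grad = False

# Snapshot the weights so they can be compared after one step.
backbone_before = backbone[0].weight.clone()
head_before = head.weight.clone()

# Give the optimizer only the parameters that should be trained.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)

inputs = torch.randn(4, 8)
targets = torch.tensor([0, 1, 2, 0])

loss = nn.functional.cross_entropy(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone[0].weight, backbone_before)
head_changed = not torch.equal(head.weight, head_before)
print(f"Backbone unchanged: {backbone_unchanged}")
print(f"Head changed: {head_changed}")
```

With a real torchvision model the same check would apply to `model.fc` and any backbone layer.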
Feature extraction reuses a pre-trained model to turn raw inputs into useful features.
Freeze layers to keep their knowledge and train only new parts.
Replace the final layer to fit your specific task.
Practice
Solution
Step 1: Understand feature extraction concept
Feature extraction uses a model already trained on a large dataset to get useful features without training all layers again.
Step 2: Identify the main benefit
This saves time and resources by reusing learned knowledge instead of starting from scratch.
Final Answer:
To use learned features from a large dataset and avoid training from scratch -> Option B
Quick Check:
Feature extraction = reuse learned features [OK]
- Thinking feature extraction means training all layers
- Confusing feature extraction with data augmentation
- Believing optimizer changes are part of feature extraction
Solution
Step 1: Freeze all layers by setting requires_grad to False
The loop disables gradient updates for all parameters to keep pre-trained weights fixed.
Step 2: Replace the final layer with a new one to train
Assigning a new linear layer to model.fc allows training only this layer for the new task.
Final Answer:
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(512, 10)
-> Option C
Quick Check:
Freeze all except final layer = freeze every parameter first, then assign a new nn.Linear(512, 10) to model.fc [OK]
- Not freezing layers before replacing final layer
- Freezing final layer instead of others
- Setting requires_grad true for all parameters
features?
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)
model.fc = torch.nn.Identity()
input_tensor = torch.randn(4, 3, 224, 224)
features = model(input_tensor)
print(features.shape)
Solution
Step 1: Understand model modification
Replacing model.fc with Identity removes the final classification layer, so output is the feature vector before classification.
Step 2: Know ResNet18 feature size
ResNet18 outputs a 512-dimensional vector before the final fc layer for each input image.
Final Answer:
torch.Size([4, 512]) -> Option A
Quick Check:
ResNet18 features = 512 dims [OK]
- Assuming output is 1000 classes without removing fc
- Confusing batch size with feature dimension
- Expecting 2048 features from ResNet18 (it's 512)
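Once the head is replaced with Identity, the model can act as a fixed feature extractor whose outputs feed a separate small classifier. The sketch below illustrates that idea under stated assumptions: `TinyNet` is a hypothetical stand-in with an `fc` attribute that mimics the ResNet layout, so its feature size is 8 rather than ResNet18's 512, and the 5-class classifier is an example choice.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in with an `fc` head, mimicking the ResNet layout.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.fc = nn.Linear(8, 1000)

    def forward(self, x):
        return self.fc(self.body(x))

model = TinyNet()
model.fc = nn.Identity()  # the model now returns features, not class scores
model.eval()              # inference mode for layers such as BatchNorm or Dropout

images = torch.randn(4, 3, 32, 32)

# no_grad skips building the autograd graph, since the extractor is not trained.
with torch.no_grad():
    features = model(images)

# A separate classifier sized from the feature dimension (5 classes assumed).
classifier = nn.Linear(features.shape[1], 5)
logits = classifier(features)

print(features.shape)  # one feature vector per image
print(logits.shape)    # one score per class per image
```

Because the features do not depend on the classifier, they can be computed once and reused across training epochs.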
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet50(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(2048, 5)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Training loop here
Solution
Step 1: Check freezing timing
The loop freezes existing parameters before replacing model.fc, so the new fc layer's parameters are created with requires_grad=True by default.
Step 2: Verify optimizer behavior
Optimizer only updates parameters where requires_grad=True, which are the new fc parameters; backbone remains frozen.
Final Answer:
No error; code is correct -> Option A
Quick Check:
New layer params unfrozen by default [OK]
- Assuming freezing all parameters includes new layers
- Changing optimizer without fixing requires_grad
- Removing freezing unnecessarily
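The order of freezing and replacing matters, and a small experiment makes this visible. This is a sketch on a hypothetical three-layer stand-in model (`make_model` and `count_trainable` are illustrative helpers, not library functions). It compares "freeze, then replace" with "replace, then freeze", and also shows an optimizer that receives only the trainable parameters.

```python
import torch
import torch.nn as nn

def count_trainable(model):
    """Count scalar parameters that still require gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def make_model():
    # Hypothetical stand-in: two "backbone" layers and a 1000-class head.
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1000))

# Order A: freeze first, then replace the head (the new head stays trainable).
model_a = make_model()
for param in model_a.parameters():
    param.requires_grad = False
model_a[2] = nn.Linear(16, 5)

# Order B: replace the head first, then freeze (the new head is frozen too).
model_b = make_model()
model_b[2] = nn.Linear(16, 5)
for param in model_b.parameters():
    param.requires_grad = False

trainable_a = count_trainable(model_a)  # 16 * 5 weights + 5 biases = 85
trainable_b = count_trainable(model_b)  # 0, nothing left to train

# Optional: hand the optimizer only the parameters that require gradients.
optimizer = torch.optim.SGD(
    (p for p in model_a.parameters() if p.requires_grad), lr=0.01
)

print(trainable_a, trainable_b)
```

Passing all of `model.parameters()` to the optimizer also works, as the solution above explains; filtering simply makes the intent explicit.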
Solution
Step 1: Understand freezing impact
Freezing all but last layer may limit model's ability to adapt features to new classes, causing low accuracy.
Step 2: Fine-tune some deeper layers
Unfreezing some layers closer to output allows the model to adjust features better for your specific dataset.
Final Answer:
Unfreeze some deeper layers to fine-tune features for your task -> Option D
Quick Check:
Fine-tune layers = better adaptation [OK]
- Increasing learning rate too much causes instability
- Changing optimizer without addressing feature adaptation
- Reducing batch size unnecessarily
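Partial fine-tuning along these lines can be sketched as follows. The example assumes a hypothetical stand-in model with blocks named `early`, `late`, and `fc`; in a torchvision ResNet the block closest to the output is `model.layer4`. The two learning rates are illustrative values, not recommendations from the lesson.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in with an early block, a late block, and a head.
class TinyBackbone(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.early = nn.Linear(8, 16)
        self.late = nn.Linear(16, 16)
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        x = torch.relu(self.early(x))
        x = torch.relu(self.late(x))
        return self.fc(x)

model = TinyBackbone(num_classes=5)

# Start fully frozen, then unfreeze only the block nearest the output and the head.
for param in model.parameters():
    param.requires_grad = False
for param in model.late.parameters():
    param.requires_grad = True
for param in model.fc.parameters():
    param.requires_grad = True

# Parameter groups: a smaller learning rate for the reused block, so its
# learned features change slowly, and a larger one for the fresh head.
optimizer = torch.optim.SGD(
    [
        {"params": model.late.parameters(), "lr": 1e-4},
        {"params": model.fc.parameters(), "lr": 1e-2},
    ],
    momentum=0.9,
)

trainable_names = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable_names)
```

Keeping the early block frozen preserves the most generic features, while the unfrozen block can adapt to the new dataset.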
