Fine-tuning helps a model learn new tasks faster by starting from a model that already knows something similar.
Fine-tuning approach in Computer Vision
1. Load a pre-trained model.
2. Freeze some layers to keep old knowledge.
3. Replace or add new layers for your task.
4. Train the new layers on your data.
5. Optionally unfreeze some layers and train more.
Freezing layers means their weights do not change during training.
Replacing the last layer is common to match the number of classes in your task.
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

base_model = MobileNetV2(weights='imagenet', include_top=False, pooling='avg')
for layer in base_model.layers:
    layer.trainable = False

output = Dense(5, activation='softmax')(base_model.output)
model = Model(inputs=base_model.input, outputs=output)
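# Unfreeze the last 10 layers so they can be updated
for layer in model.layers[-10:]:
    layer.trainable = True

# train_data is a placeholder for your own dataset of images and one-hot labels
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_data, epochs=5)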
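One way to confirm that freezing took effect is to count the trainable weights before training. The sketch below is only an illustration: it assumes TensorFlow is installed and swaps MobileNetV2 for a tiny hypothetical backbone with random weights (the names `toy_backbone`, `backbone_dense`, and `new_head` are made up here) so that nothing needs to be downloaded.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical stand-in for a pre-trained backbone (random weights, no download).
inputs = tf.keras.Input(shape=(8,))
features = layers.Dense(16, activation="relu", name="backbone_dense")(inputs)
backbone = Model(inputs, features, name="toy_backbone")

# Freeze every backbone layer so its weights are excluded from updates.
for layer in backbone.layers:
    layer.trainable = False

# Attach a new 5-class output layer; it is trainable by default.
outputs = layers.Dense(5, activation="softmax", name="new_head")(backbone.output)
model = Model(inputs=backbone.input, outputs=outputs)

# Each Dense layer owns two weight tensors: a kernel and a bias.
n_trainable = len(model.trainable_weights)
n_frozen = len(model.non_trainable_weights)
print("trainable tensors:", n_trainable, "frozen tensors:", n_frozen)
```

The same two attributes, `trainable_weights` and `non_trainable_weights`, can be inspected on the MobileNetV2 model built above; only the numbers will be larger.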
This program shows how to fine-tune a pre-trained MobileNetV2 model on a small dummy dataset with 5 classes. It first trains only the new output layer, then unfreezes some layers to improve learning.
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
import numpy as np

# Create dummy data: 100 images 96x96x3, 5 classes
x_train = np.random.rand(100, 96, 96, 3).astype('float32')
y_train = to_categorical(np.random.randint(5, size=100), num_classes=5)

# Load pre-trained MobileNetV2 without top layers
base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(96, 96, 3), pooling='avg')

# Freeze base model layers
for layer in base_model.layers:
    layer.trainable = False

# Add new output layer for 5 classes
output = Dense(5, activation='softmax')(base_model.output)
model = Model(inputs=base_model.input, outputs=output)

# Compile model
model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])

# Train only new layers
history = model.fit(x_train, y_train, epochs=3, batch_size=10, verbose=2)

# Unfreeze last 20 layers for fine-tuning
for layer in base_model.layers[-20:]:
    layer.trainable = True

# Recompile with lower learning rate
model.compile(optimizer=Adam(1e-5), loss='categorical_crossentropy', metrics=['accuracy'])

# Continue training
history_fine = model.fit(x_train, y_train, epochs=2, batch_size=10, verbose=2)
Fine-tuning works best when your new task is similar to the original task the model was trained on.
Start by training only new layers, then gradually unfreeze more layers to avoid losing old knowledge.
Use a smaller learning rate when fine-tuning to make small adjustments.
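To see why the learning rate matters, here is a framework-free toy sketch. It is not a real network: it assumes a single weight `w`, a quadratic loss `(w - 3)**2`, and a hypothetical "pre-trained" starting value of 2.5 that is already close to the optimum. The function name and all numbers are illustrative.

```python
def gradient_descent(start, lr, steps=20, target=3.0):
    """Minimize (w - target)**2 from `start` with plain gradient descent."""
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - target)  # derivative of (w - target)**2
        w -= lr * grad
    return w

pretrained_w = 2.5  # hypothetical weight that is already near the optimum 3.0

# A small step size moves the weight gently toward the optimum.
small_error = abs(gradient_descent(pretrained_w, lr=0.05) - 3.0)

# A large step size overshoots further on every update and diverges.
large_error = abs(gradient_descent(pretrained_w, lr=1.1) - 3.0)

print("error with lr=0.05:", small_error)
print("error with lr=1.1:", large_error)
```

With the small rate the error shrinks below its starting value of 0.5, while with the large rate it grows on every step. Real networks are far more complex, but the same overshoot is what can erase pre-trained features.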
Fine-tuning reuses a pre-trained model to learn new tasks faster.
Freeze the pre-trained layers first, then train the new layers, and finally unfreeze some layers to improve accuracy.
Use smaller learning rates during fine-tuning for better results.
Practice
Solution
Step 1: Understand fine-tuning concept
Fine-tuning means starting from a model already trained on a related task.
Step 2: Identify the benefit
This approach saves time and data by reusing learned features for a new task.
Final Answer:
To adapt the model to a new task using less data and time -> Option A
Quick Check:
Fine-tuning = adapt pre-trained model fast [OK]
- Thinking fine-tuning trains from scratch
- Assuming fine-tuning always increases model size
- Confusing fine-tuning with pruning layers
Solution
Step 1: Recall PyTorch freezing syntax
In PyTorch, freezing means setting requires_grad = False for parameters.
Step 2: Match code to syntax
for param in model.parameters(): param.requires_grad = False correctly loops over parameters and disables gradient updates.
Final Answer:
for param in model.parameters(): param.requires_grad = False -> Option A
Quick Check:
Freeze layers = requires_grad False [OK]
- Using non-existent methods like freeze_layers()
- Setting model.trainable instead of parameters
- Confusing trainable True/False for freezing
print(sum(p.requires_grad for p in model.parameters())) after freezing layers?
Solution
Step 1: Understand freezing effect on requires_grad
Freezing sets requires_grad = False for all parameters.
Step 2: Calculate sum of requires_grad flags
Since all are False, sum counts zero True values.
Final Answer:
0 -> Option C
Quick Check:
All frozen means requires_grad sum = 0 [OK]
- Assuming sum counts total parameters
- Thinking sum counts unfrozen parameters without freezing
- Expecting an error from requires_grad attribute
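The count discussed above can be checked on a small model. This is a minimal sketch assuming PyTorch is installed; the three-layer `nn.Sequential` is a hypothetical stand-in with random weights, not a real pre-trained network.

```python
from torch import nn

# Hypothetical stand-in for a pre-trained network with a 10-class head.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 10),
)

# Freeze everything: no parameter will receive gradient updates.
for param in model.parameters():
    param.requires_grad = False

# Each flag is a Python bool, so the sum counts how many are True.
frozen_count = sum(p.requires_grad for p in model.parameters())
print(frozen_count)

# Swap in a new 5-class head; freshly created parameters require gradients.
model[-1] = nn.Linear(16, 5)
after_new_head = sum(p.requires_grad for p in model.parameters())
print(after_new_head)
```

Note that the sum counts parameter tensors (here the new layer's weight and bias), not individual scalar weights.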
Solution
Step 1: Identify learning rate impact
A very high learning rate can cause unstable training and no improvement.
Step 2: Evaluate other options
Freezing all layers prevents further learning but usually preserves baseline accuracy; starting from a pre-trained model helps rather than hurts; and adding new untrained layers does not prevent improvement as long as those layers are trained.
Final Answer:
Using a very high learning rate during fine-tuning -> Option D
Quick Check:
High learning rate = no improvement [OK]
- Ignoring learning rate effects
- Assuming freezing all layers always improves
- Thinking training from scratch is better always
Solution
Step 1: Replace final layer for new classes
Adjust output layer to match 5 classes for the new task.
Step 2: Freeze old layers and train new layer first
This preserves learned features and trains new output layer quickly.
Step 3: Unfreeze some layers and fine-tune with low learning rate
This improves model performance by adapting features carefully without large updates.
Final Answer:
Freeze all layers, replace final layer with 5 outputs, train only final layer, then unfreeze some layers and fine-tune with low learning rate -> Option B
Quick Check:
Stepwise fine-tuning with low LR = best practice [OK]
- Training all layers at once with high learning rate
- Training from scratch ignoring pre-trained weights
- Freezing final layer instead of earlier layers
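The stepwise approach above can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not a recipe: the tiny `backbone`, the random data, the `train` helper, and the step counts are all hypothetical, and because the toy backbone has a single layer, "unfreeze some layers" here means unfreezing all of it.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins: a tiny "pre-trained" backbone and random data.
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
head = nn.Linear(16, 5)  # new output layer for 5 classes
model = nn.Sequential(backbone, head)
x = torch.randn(32, 8)
y = torch.randint(0, 5, (32,))
loss_fn = nn.CrossEntropyLoss()

def train(optimizer, steps):
    """Run a few full-batch gradient steps with the given optimizer."""
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()

# Stage 1: freeze the backbone and train only the new head.
for p in backbone.parameters():
    p.requires_grad = False
before = backbone[0].weight.detach().clone()
train(torch.optim.Adam(head.parameters(), lr=1e-3), steps=5)
backbone_unchanged = torch.equal(before, backbone[0].weight)

# Stage 2: unfreeze the backbone and fine-tune with a much lower learning rate.
for p in backbone.parameters():
    p.requires_grad = True
train(torch.optim.Adam(model.parameters(), lr=1e-5), steps=5)
backbone_changed = not torch.equal(before, backbone[0].weight)

print(backbone_unchanged, backbone_changed)
```

Creating a fresh optimizer for the second stage plays the same role as recompiling the Keras model: it picks up the newly unfrozen parameters and the lower learning rate.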
