Fine-tuning helps a model learn new tasks faster by starting from a model already trained on similar data.
Fine-tuning approach in TensorFlow
```python
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights='imagenet')
base_model.trainable = False

# Add new layers on top
inputs = tf.keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

# Compile and train the new layers only
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_data, epochs=initial_epochs)

# Unfreeze the last `fine_tune_at` layers for fine-tuning
base_model.trainable = True
for layer in base_model.layers[:-fine_tune_at]:
    layer.trainable = False

# Recompile with a low learning rate so pretrained weights change slowly
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_data,
          epochs=initial_epochs + fine_tune_epochs,
          initial_epoch=initial_epochs)
```
Start by freezing the base model to train only new layers.
Then unfreeze some layers to fine-tune with a low learning rate.
```python
import tensorflow as tf

# Load pretrained ResNet50 without its classification head and freeze it
base_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False)
base_model.trainable = False

# Later, unfreeze only the last 10 layers for fine-tuning
for layer in base_model.layers[-10:]:
    layer.trainable = True

# Recompile with a low learning rate after changing trainable flags
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```
The following complete example fine-tunes a pretrained MobileNetV2 model on randomly generated dummy data with 3 classes. It first trains only the new layers, then unfreezes part of the base model and fine-tunes it.
```python
import numpy as np
import tensorflow as tf

# Load pretrained MobileNetV2 without top layers
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights='imagenet')
base_model.trainable = False

# Number of classes for the new task
num_classes = 3

# Build a new model on top
inputs = tf.keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

# Compile and train the new layers
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Create dummy data: 100 samples of 224x224 RGB images and labels
train_images = np.random.rand(100, 224, 224, 3).astype('float32')
train_labels = np.random.randint(0, num_classes, 100)

initial_epochs = 2
model.fit(train_images, train_labels, epochs=initial_epochs)

# Unfreeze the last 20 layers for fine-tuning
base_model.trainable = True
fine_tune_at = len(base_model.layers) - 20
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False

# Recompile with a low learning rate
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

fine_tune_epochs = 2
model.fit(train_images, train_labels,
          epochs=initial_epochs + fine_tune_epochs,
          initial_epoch=initial_epochs)
```
Fine-tuning works best when the new task is similar to the original task the model learned.
Use a smaller learning rate during fine-tuning to avoid large changes to the pretrained weights.
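To see why the learning rate matters, here is a minimal NumPy sketch of a single plain gradient-descent update at two learning rates. The weight and gradient values are made up for illustration, and real optimizers like Adam scale updates adaptively, but the basic effect is the same: a smaller learning rate keeps weights close to their pretrained values.

```python
import numpy as np

# Hypothetical pretrained weights and a gradient from the new task
weights = np.array([0.52, -1.30, 0.88])
gradient = np.array([0.40, -0.25, 0.10])

def sgd_step(w, g, learning_rate):
    """One plain gradient-descent update: w <- w - learning_rate * g."""
    return w - learning_rate * g

# A typical default learning rate vs. a typical fine-tuning rate
w_default = sgd_step(weights, gradient, 1e-3)
w_finetune = sgd_step(weights, gradient, 1e-5)

# The fine-tuning step moves the weights 100x less
print(np.abs(w_default - weights).max())   # ~4e-4
print(np.abs(w_finetune - weights).max())  # ~4e-6
```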
Freezing too few layers (that is, fine-tuning too many) can cause overfitting when the new dataset is small.
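The trade-off can be sketched with a small helper that computes how many layers stay frozen for a given unfreeze count. The layer total of 154 below is a made-up figure for illustration; the more layers you unfreeze, the more parameters are free to fit (and overfit) a small dataset.

```python
def split_freeze(num_layers, num_to_unfreeze):
    """Return (frozen, trainable) layer counts when only the last
    `num_to_unfreeze` layers of a base model are fine-tuned."""
    num_to_unfreeze = min(num_to_unfreeze, num_layers)
    return num_layers - num_to_unfreeze, num_to_unfreeze

# Hypothetical base model with 154 layers
for unfreeze in (10, 50, 154):
    frozen, trainable = split_freeze(154, unfreeze)
    print(f"unfreeze last {unfreeze:3d}: {frozen} frozen, {trainable} trainable")
```

With only a few hundred training images, keeping most layers frozen limits how much of the model can memorize the small dataset.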
Fine-tuning adapts a pretrained model to a new task by training some layers.
Start by training new layers with the base model frozen.
Then unfreeze some base layers and train with a low learning rate.