TensorFlow · ~5 mins

Learning rate for fine-tuning in TensorFlow

Introduction

Learning rate controls how much the model changes with each step. For fine-tuning, a smaller learning rate helps the model adjust gently to new data without forgetting what it learned before.
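The update described above can be sketched as w_new = w - learning_rate * gradient. A quick numeric illustration (the weight and gradient values are made up):

```python
# Sketch of a single gradient-descent update with illustrative numbers
w = 1.0      # current weight value
grad = 0.5   # gradient of the loss with respect to w
for lr in (0.1, 0.0001):
    print(w - lr * grad)  # smaller learning rate -> smaller change
# prints 0.95, then 0.99995
```

With lr=0.1 the weight moves by 5%; with lr=0.0001 it barely moves, which is why fine-tuning favors the smaller value.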

You have a pre-trained model and want to adapt it to a new but related task.
You want to improve a model's performance on a specific dataset without training from scratch.
You want to avoid large changes that could ruin the useful features learned earlier.
You want to save time by training fewer epochs with careful updates.
You want to prevent overfitting by making small, careful adjustments.
Syntax
TensorFlow
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

The learning_rate value is usually 10-100x smaller for fine-tuning than for training from scratch (for example, 1e-4 or 1e-5 instead of Adam's default of 1e-3).

You can use different optimizers like Adam, SGD, or RMSprop with a custom learning rate.
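All three optimizers take the same learning_rate argument, so swapping one for another is a one-line change. A minimal sketch (the rates shown are illustrative, not prescriptive):

```python
import tensorflow as tf

# Each optimizer accepts a custom learning rate;
# the values below are illustrative choices for fine-tuning
adam = tf.keras.optimizers.Adam(learning_rate=1e-4)
sgd = tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9)
rmsprop = tf.keras.optimizers.RMSprop(learning_rate=1e-4)

# The configured rate can be read back from any of them
print(float(adam.learning_rate))
```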

Examples
Using a very small learning rate for sensitive fine-tuning on a classification task.
TensorFlow
optimizer = tf.keras.optimizers.Adam(learning_rate=0.00001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
Using SGD optimizer with momentum and a small learning rate for fine-tuning.
TensorFlow
optimizer = tf.keras.optimizers.SGD(learning_rate=0.0001, momentum=0.9)
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
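Instead of a fixed value, a decay schedule can also be passed as the learning rate, so updates shrink automatically as training progresses. A sketch (the initial rate, decay_steps, and decay_rate here are illustrative):

```python
import tensorflow as tf

# A decay schedule can be passed in place of a fixed learning rate
# (initial_learning_rate, decay_steps, and decay_rate are illustrative)
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-4, decay_steps=1000, decay_rate=0.9)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)
```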
Sample Model

This example shows how to load a pre-trained model, freeze it, add a new head, and train with a small learning rate. Then it unfreezes the base model and fine-tunes with an even smaller learning rate.

TensorFlow
import tensorflow as tf

# Load a pre-trained MobileNetV2 model without the top layer
base_model = tf.keras.applications.MobileNetV2(input_shape=(96, 96, 3), include_top=False, weights='imagenet')
base_model.trainable = False  # Freeze base model

# Add new classification head
inputs = tf.keras.Input(shape=(96, 96, 3))
x = base_model(inputs, training=False)  # keep BatchNorm layers in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10)(x)  # 10 classes, output as raw logits
model = tf.keras.Model(inputs, outputs)

# Compile with a small learning rate for fine-tuning
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Create dummy data
import numpy as np
x_train = np.random.random((20, 96, 96, 3))
y_train = np.random.randint(0, 10, 20)

# Train the model
history = model.fit(x_train, y_train, epochs=2, batch_size=5, verbose=2)

# Unfreeze base model for fine-tuning
base_model.trainable = True

# Recompile with even smaller learning rate
optimizer_finetune = tf.keras.optimizers.Adam(learning_rate=0.00001)
model.compile(optimizer=optimizer_finetune,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Continue training
history_finetune = model.fit(x_train, y_train, epochs=2, batch_size=5, verbose=2)
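After recompiling, you can confirm which learning rate is actually active by reading it back from the compiled model. A minimal self-contained check (using a tiny stand-in model, not the sample above):

```python
import tensorflow as tf

# Tiny stand-in model, just to show reading back the learning rate
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5), loss='mse')

# The active learning rate is available on the compiled model's optimizer
print(float(model.optimizer.learning_rate))
```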
Important Notes

Start with a small learning rate to avoid destroying the pre-trained features.

After initial training, unfreeze some layers and use an even smaller learning rate for fine-tuning.
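Unfreezing only some layers can be sketched as follows, using a small stand-in model in place of a real backbone; how many layers to leave frozen is a judgment call, and the single-layer split here is purely illustrative:

```python
import tensorflow as tf

# Stand-in base model (in practice this would be e.g. MobileNetV2)
base_model = tf.keras.Sequential([
    tf.keras.layers.Dense(8),
    tf.keras.layers.Dense(8),
    tf.keras.layers.Dense(8),
])

# Unfreeze everything, then re-freeze all but the last layer
base_model.trainable = True
for layer in base_model.layers[:-1]:
    layer.trainable = False
```

Remember to recompile the model after changing trainable flags, or the change has no effect on training.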

Monitor training metrics: a loss that spikes or oscillates suggests the learning rate is too high, while a loss that barely decreases suggests it is too low.
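One common way to act on that monitoring automatically is the ReduceLROnPlateau callback, which lowers the learning rate when a metric stalls. A sketch (the factor, patience, and min_lr values are illustrative choices):

```python
import tensorflow as tf

# Halve the learning rate when validation loss stops improving
# (factor, patience, and min_lr are illustrative choices)
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss', factor=0.5, patience=2, min_lr=1e-6)

# Passed to training via the callbacks argument, e.g.:
# model.fit(x_train, y_train, validation_split=0.2, callbacks=[reduce_lr])
```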

Summary

Use a smaller learning rate when fine-tuning to make gentle updates.

Freeze the base model first, then unfreeze and fine-tune with a smaller learning rate.

Choose learning rates carefully to balance learning speed and stability.