Learning rate scheduling adjusts the optimizer's step size over the course of training: it typically starts with a large learning rate for fast initial progress, then decays it so the model can settle into a good minimum and improve accuracy.
Learning rate scheduling in TensorFlow
Introduction
When you are training a neural network and want accuracy to keep improving over time.
When the model stops improving and you want to try slowing down learning to fine-tune.
When training takes a long time and you want to avoid overshooting the best solution.
When you want to avoid the model jumping around too much in the learning process.
Syntax
TensorFlow
import tensorflow as tf

# Example: create a learning rate schedule
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,
    decay_steps=10000,
    decay_rate=0.96,
    staircase=True
)

# Use the schedule in an optimizer
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule)
The learning rate schedule changes the learning rate during training automatically.
Common schedules include exponential decay, step decay, and cosine decay.
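To see what exponential decay actually computes, here is a plain-Python sketch of the formula TensorFlow documents for ExponentialDecay (the function name and defaults below are illustrative, not part of the TensorFlow API):

```python
import math

def exponential_decay(step, initial_lr=0.1, decay_steps=10000,
                      decay_rate=0.96, staircase=False):
    """Mirror of the formula ExponentialDecay applies at each step:
    initial_lr * decay_rate ** (step / decay_steps)."""
    exponent = step / decay_steps
    if staircase:
        # With staircase=True the exponent is floored, so the rate
        # drops in discrete jumps every `decay_steps` steps.
        exponent = math.floor(exponent)
    return initial_lr * decay_rate ** exponent

# With staircase=True the rate is constant within each 10000-step window:
print(exponential_decay(0, staircase=True))      # 0.1
print(exponential_decay(9999, staircase=True))   # still 0.1
print(exponential_decay(10000, staircase=True))  # drops to 0.1 * 0.96
```

With staircase=False the same formula decays the rate a little at every single step instead of in jumps.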
Examples
This schedule multiplies the learning rate by 0.9 (a 10% reduction) every 1000 steps; because staircase is not set, the decay is applied smoothly at every step rather than in jumps.
TensorFlow
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
initial_learning_rate=0.01,
decay_steps=1000,
decay_rate=0.9
)
This schedule uses fixed learning rates that change at steps 1000 and 2000.
TensorFlow
schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
boundaries=[1000, 2000],
values=[0.1, 0.01, 0.001]
)
This schedule reduces the learning rate following a cosine curve over 10000 steps.
TensorFlow
schedule = tf.keras.optimizers.schedules.CosineDecay(
initial_learning_rate=0.1,
decay_steps=10000
)
Sample Model
This program trains a simple model to learn y=2x+1 using a learning rate that halves every 2 steps. It shows how the learning rate changes and the prediction after training.
TensorFlow
import tensorflow as tf
import numpy as np

# Create simple data: y = 2x + 1
x_train = np.array([[1.0], [2.0], [3.0], [4.0]], dtype=np.float32)
y_train = np.array([[3.0], [5.0], [7.0], [9.0]], dtype=np.float32)

# Define a simple linear model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=(1,))
])

# Define learning rate schedule: halve the rate every 2 steps
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,
    decay_steps=2,
    decay_rate=0.5,
    staircase=True
)

# Use SGD optimizer with the learning rate schedule
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule)

# Compile model
model.compile(optimizer=optimizer, loss='mse')

# Train model
history = model.fit(x_train, y_train, epochs=5, verbose=2)

# Predict
predictions = model.predict(np.array([[5.0]]))
print(f"Prediction for input 5.0: {predictions[0][0]:.4f}")
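To make "halves every 2 steps" concrete, here is a small plain-Python sketch (not part of the TensorFlow program above) of the learning rate the optimizer uses at each training step; with 4 samples and the default batch size, each epoch is a single step:

```python
# Plain-Python sketch of the rate the SGD optimizer above sees at each
# step, given the same ExponentialDecay settings (decay_steps=2,
# decay_rate=0.5, staircase=True).
def lr_at_step(step, initial_lr=0.1, decay_steps=2, decay_rate=0.5):
    # Integer division reproduces staircase=True behavior.
    return initial_lr * decay_rate ** (step // decay_steps)

for step in range(5):
    print(f"step {step}: lr = {lr_at_step(step)}")
# step 0: lr = 0.1
# step 1: lr = 0.1
# step 2: lr = 0.05
# step 3: lr = 0.05
# step 4: lr = 0.025
```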
Important Notes
Learning rate schedules help avoid getting stuck or jumping too much during training.
Choosing the right schedule and parameters can improve model results significantly.
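One way to compare how the parameters shape each schedule is to sketch the three decay formulas from the examples above in plain Python (illustrative approximations of the documented formulas, with options such as staircase and alpha omitted):

```python
import math

# Exponential decay: multiply by `rate` every `decay_steps` steps.
def exponential(step, lr0=0.01, decay_steps=1000, rate=0.9):
    return lr0 * rate ** (step / decay_steps)

# Piecewise constant: fixed values that switch at the boundaries.
def piecewise(step, boundaries=(1000, 2000), values=(0.1, 0.01, 0.001)):
    for boundary, value in zip(boundaries, values):
        if step <= boundary:
            return value
    return values[-1]

# Cosine decay: follow half a cosine curve from lr0 down to 0.
def cosine(step, lr0=0.1, decay_steps=10000):
    step = min(step, decay_steps)
    return lr0 * 0.5 * (1 + math.cos(math.pi * step / decay_steps))

for step in (0, 1000, 5000, 10000):
    print(step, exponential(step), piecewise(step), cosine(step))
```

Plotting or printing the three curves side by side like this is a cheap way to sanity-check schedule parameters before committing to a long training run.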
Summary
Learning rate scheduling changes the learning speed during training.
A well-chosen schedule speeds up early training and stabilizes fine-tuning, which often improves final accuracy.
TensorFlow offers many built-in schedules like exponential and cosine decay.