CosineAnnealingLR decreases the learning rate smoothly along a cosine curve. This gradual decay can help a model train more stably than schedules that change the rate abruptly.
CosineAnnealingLR in PyTorch
Introduction
When you want the learning rate to decrease gradually during training and then rise again.
When you want to avoid abrupt learning rate drops that can destabilize training.
When you want to improve training stability by adjusting the learning rate along a smooth curve.
When you want a schedule that may help the model escape shallow local minima.
When you want to experiment with cyclical learning rates that return to their starting value after a set number of steps.
Syntax
PyTorch
torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=0, last_epoch=-1, verbose=False)
optimizer: The optimizer whose learning rate you want to adjust.
T_max: Number of epochs or steps in half a cosine cycle; the learning rate reaches its minimum after T_max steps.
eta_min: Minimum learning rate. Default: 0.
last_epoch: Index of the last epoch, used when resuming training. Default: -1.
verbose: If True, prints a message for each update. Default: False (deprecated in recent PyTorch versions).
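The PyTorch documentation gives the schedule in closed form: eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * t / T_max)) / 2. The sketch below checks the scheduler against this formula by hand; the helper name expected_lr is illustrative, not part of the PyTorch API.

```python
import math
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10, eta_min=0.0)

# Closed-form cosine annealing schedule from the PyTorch docs
def expected_lr(t, eta_max=0.1, eta_min=0.0, T_max=10):
    return eta_min + (eta_max - eta_min) * (1 + math.cos(math.pi * t / T_max)) / 2

for t in range(1, 6):
    optimizer.step()   # step the optimizer first, then the scheduler
    scheduler.step()
    lr = optimizer.param_groups[0]['lr']
    print(f"t={t}: scheduler {lr:.5f} vs formula {expected_lr(t):.5f}")
```

At t = 5 (halfway through T_max=10), cos(pi/2) = 0, so the learning rate is exactly halfway between eta_max and eta_min.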
Examples
Learning rate will decrease following a cosine curve over 10 steps.
PyTorch
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
Learning rate decreases to a minimum of 0.001 over 20 steps.
PyTorch
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20, eta_min=0.001)
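To read the current learning rate without reaching into the optimizer, PyTorch schedulers also expose get_last_lr(), which returns one value per parameter group. A minimal sketch:

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20, eta_min=0.001)

# Before any scheduler.step(), the last LR is the optimizer's initial lr
print(scheduler.get_last_lr())
```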
Sample Model
This code prints the learning rate over 10 steps using CosineAnnealingLR with T_max=5. The learning rate starts at 0.1, decreases smoothly to 0.01 by step 5, then rises back toward 0.1 over the next 5 steps.
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim

# Simple model
model = nn.Linear(2, 1)

# Optimizer
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Scheduler with T_max=5 steps
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=5, eta_min=0.01)

print('Step | Learning Rate')
for step in range(10):
    # Dummy training step
    optimizer.zero_grad()
    dummy_input = torch.tensor([[1.0, 2.0]])
    output = model(dummy_input)
    loss = output.sum()
    loss.backward()
    optimizer.step()

    # Step the scheduler
    scheduler.step()

    # Print current learning rate
    lr = optimizer.param_groups[0]['lr']
    print(f'{step+1:4d} | {lr:.5f}')
Important Notes
The learning rate follows a cosine curve from the initial value down to eta_min.
After T_max steps, the learning rate reaches eta_min; if training continues, it rises back toward the initial value along the same cosine curve rather than jumping back. For hard restarts, use CosineAnnealingWarmRestarts instead.
Use scheduler.step() after each optimizer step to update the learning rate.
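Past T_max the learning rate does not jump back to its starting value; it continues along the cosine curve, so the schedule oscillates with period 2 * T_max. A minimal sketch recording 10 steps with T_max=5 (the list name lrs is illustrative):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=5, eta_min=0.0)

lrs = []
for step in range(10):
    optimizer.step()
    scheduler.step()
    lrs.append(optimizer.param_groups[0]['lr'])

# The LR bottoms out at eta_min at step 5, then climbs back toward 0.1 by step 10
print([round(lr, 4) for lr in lrs])
```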
Summary
CosineAnnealingLR changes the learning rate smoothly along a cosine curve.
It helps training by avoiding sudden learning rate changes.
Use it by setting T_max for cycle length and optionally eta_min for minimum learning rate.