PyTorch · ~5 mins

CosineAnnealingLR in PyTorch

Introduction

CosineAnnealingLR lowers the learning rate smoothly along a cosine curve, from its initial value down to a chosen minimum. This gradual schedule helps the model train more stably than schedules that change the rate too fast or drop it abruptly.

When you want the learning rate to decrease smoothly over the course of training, following a cosine cycle.
When you want to avoid sudden drops in learning rate that can destabilize the model.
When you want to improve training stability by adjusting the learning rate in a smooth pattern.
When you want a learning rate schedule that can help the model escape poor local minima.
When you want to experiment with cyclical learning rates (for schedules that reset abruptly, see CosineAnnealingWarmRestarts).
Syntax
PyTorch
torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=0, last_epoch=-1, verbose=False)

optimizer: The optimizer whose learning rate you want to adjust.

T_max: Number of epochs or steps over which the learning rate decreases from its initial value to eta_min (half a cosine period).

eta_min: Minimum learning rate. Default: 0.

last_epoch: Index of the last epoch, used when resuming training. Default: -1.
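Under the hood, the scheduler evaluates a closed-form cosine formula. Here is a minimal pure-Python sketch of that formula (the helper name cosine_annealing_lr is ours, not part of PyTorch):

```python
import math

def cosine_annealing_lr(step, base_lr, T_max, eta_min=0.0):
    # eta_t = eta_min + (base_lr - eta_min) * (1 + cos(pi * step / T_max)) / 2
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * step / T_max)) / 2

print(cosine_annealing_lr(0, 0.1, 10))   # 0.1  (start of the cycle)
print(cosine_annealing_lr(5, 0.1, 10))   # 0.05 (halfway down)
print(cosine_annealing_lr(10, 0.1, 10))  # 0.0  (reaches eta_min at T_max)
```

Note that the rate only reaches eta_min exactly at step T_max; before that it stays above it, falling fastest in the middle of the cycle.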

Examples
Learning rate will decrease following a cosine curve over 10 steps.
PyTorch
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
Learning rate decreases to a minimum of 0.001 over 20 steps.
PyTorch
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20, eta_min=0.001)
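To see that second schedule bottom out at eta_min, you can drive the scheduler with a dummy parameter and no real training (a sketch only; the single zero parameter exists just to satisfy the optimizer):

```python
import torch

# A single dummy parameter is enough to construct the optimizer.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20, eta_min=0.001)

for _ in range(20):
    optimizer.step()
    scheduler.step()

# After T_max scheduler steps the rate sits at eta_min.
print(optimizer.param_groups[0]['lr'])  # ~0.001
```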
Sample Model

This code shows how the learning rate changes over 10 steps using CosineAnnealingLR with T_max=5. The learning rate starts at 0.1, decreases smoothly to 0.01 by step 5, then rises back toward 0.1 over the next 5 steps, still following the cosine curve.

PyTorch
import torch
import torch.nn as nn
import torch.optim as optim

# Simple model
model = nn.Linear(2, 1)

# Optimizer
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Scheduler with T_max=5 steps
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=5, eta_min=0.01)

print('Step | Learning Rate')
for step in range(10):
    # Dummy training step
    optimizer.zero_grad()
    dummy_input = torch.tensor([[1.0, 2.0]])
    output = model(dummy_input)
    loss = output.sum()
    loss.backward()
    optimizer.step()

    # Step the scheduler
    scheduler.step()

    # Print current learning rate
    lr = optimizer.param_groups[0]['lr']
    print(f'{step+1:4d} | {lr:.5f}')
Important Notes

The learning rate follows a cosine curve from the initial value down to eta_min.

After T_max steps, the learning rate reaches eta_min and then increases back toward the initial value along the same cosine curve; a full cycle takes 2 * T_max steps. If you want the rate to jump straight back to its initial value instead, use CosineAnnealingWarmRestarts.

Use scheduler.step() after each optimizer step to update the learning rate.
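You can watch a full cycle by running the sample schedule (lr=0.1, T_max=5, eta_min=0.01) for 10 steps with a dummy parameter (a sketch, no real training):

```python
import torch

# Dummy parameter just to construct the optimizer.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=5, eta_min=0.01)

lrs = []
for _ in range(10):
    optimizer.step()
    scheduler.step()
    lrs.append(optimizer.param_groups[0]['lr'])

# Falls to 0.01 at step 5, then climbs back toward 0.1 by step 10.
print([round(lr, 4) for lr in lrs])
```

Note that the rate climbs back smoothly rather than jumping; an abrupt reset to the initial rate is what CosineAnnealingWarmRestarts provides.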

Summary

CosineAnnealingLR changes the learning rate smoothly along a cosine curve.

It helps training by avoiding sudden learning rate changes.

Use it by setting T_max for cycle length and optionally eta_min for minimum learning rate.