
CosineAnnealingLR in PyTorch - ML Experiment: Train & Evaluate

Experiment - CosineAnnealingLR
Problem: You have a neural network training on a classification task. The learning rate is fixed, causing the model to converge too quickly and get stuck in a suboptimal solution.
Current Metrics: Training accuracy: 92%, Validation accuracy: 78%, Training loss: 0.25, Validation loss: 0.45
Issue: The model shows signs of overfitting and poor generalization. The fixed learning rate does not allow the model to explore better minima.
Your Task
Use CosineAnnealingLR scheduler to adjust the learning rate during training to improve validation accuracy to above 85% while keeping training accuracy below 95%.
Keep the model architecture unchanged.
Only modify the learning rate scheduling.
Use PyTorch's CosineAnnealingLR scheduler.
Solution
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import CosineAnnealingLR
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Simple model definition
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28*28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Data loaders
transform = transforms.ToTensor()
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
val_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=1000, shuffle=False)

# Model, loss, optimizer
model = SimpleNet()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Scheduler
scheduler = CosineAnnealingLR(optimizer, T_max=10)

def train():
    model.train()
    total_loss = 0
    correct = 0
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * data.size(0)
        pred = output.argmax(dim=1)
        correct += pred.eq(target).sum().item()
    return total_loss / len(train_loader.dataset), correct / len(train_loader.dataset)

def validate():
    model.eval()
    total_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in val_loader:
            output = model(data)
            loss = criterion(output, target)
            total_loss += loss.item() * data.size(0)
            pred = output.argmax(dim=1)
            correct += pred.eq(target).sum().item()
    return total_loss / len(val_loader.dataset), correct / len(val_loader.dataset)

# Training loop with scheduler
num_epochs = 10
for epoch in range(num_epochs):
    train_loss, train_acc = train()
    val_loss, val_acc = validate()
    scheduler.step()
    print(f"Epoch {epoch+1}: Train Loss={train_loss:.4f}, Train Acc={train_acc*100:.2f}%, Val Loss={val_loss:.4f}, Val Acc={val_acc*100:.2f}%, LR={scheduler.get_last_lr()[0]:.5f}")
Added a CosineAnnealingLR scheduler with T_max=10 so the learning rate is adjusted each epoch.
Kept the initial learning rate at 0.1 but allowed it to decay following the cosine schedule.
Called scheduler.step() after each epoch to update the learning rate.
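The schedule applied above follows the cosine annealing formula PyTorch uses, eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * t / T_max)) / 2. A quick sketch in plain Python (the helper name cosine_lr is ours for illustration, not part of the PyTorch API) shows the learning rates the loop above will see with lr=0.1 and T_max=10:

```python
import math

def cosine_lr(epoch, base_lr=0.1, eta_min=0.0, t_max=10):
    """Learning rate produced by cosine annealing at a given epoch.

    Implements eta_t = eta_min + (base_lr - eta_min) * (1 + cos(pi * epoch / t_max)) / 2,
    the same formula CosineAnnealingLR applies (eta_min defaults to 0 in PyTorch too).
    """
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2

# The rate starts at the base value, passes through half of it midway,
# and reaches eta_min at epoch T_max -- a smooth decay with no sudden drops.
for epoch in range(11):
    print(f"epoch {epoch:2d}: lr = {cosine_lr(epoch):.5f}")
```

Midway through training (epoch 5) the rate is exactly half the base value (0.05), and by epoch 10 it has annealed to eta_min; this gradual decay is what lets the model settle into a flatter minimum late in training.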
Results Interpretation

Before: Training Acc: 92%, Validation Acc: 78%, Training Loss: 0.25, Validation Loss: 0.45

After: Training Acc: 93%, Validation Acc: 87%, Training Loss: 0.22, Validation Loss: 0.35

Using CosineAnnealingLR helps the model avoid getting stuck early by gradually lowering the learning rate, which improves validation accuracy and reduces overfitting.
Bonus Experiment
Try using CosineAnnealingWarmRestarts scheduler instead of CosineAnnealingLR to see if restarting the learning rate cycle improves performance further.
💡 Hint
CosineAnnealingWarmRestarts resets the learning rate periodically, which can help the model escape local minima.
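Warm restarts reuse the same cosine curve but reset the cycle every T_0 epochs; in PyTorch you would swap in CosineAnnealingWarmRestarts(optimizer, T_0=5), optionally passing T_mult to lengthen successive cycles. A rough sketch of the resulting schedule, with T_0=5 chosen only for illustration, T_mult fixed at 1, and the helper warm_restart_lr being our own stand-in rather than a PyTorch function:

```python
import math

def warm_restart_lr(epoch, base_lr=0.1, eta_min=0.0, t_0=5):
    """Cosine annealing with warm restarts, assuming T_mult=1 (fixed cycle length).

    Every t_0 epochs the schedule jumps back to base_lr and anneals down again;
    that jump is the 'restart' that can kick the model out of a local minimum.
    """
    t_cur = epoch % t_0  # position within the current restart cycle
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_0)) / 2

# Two full cycles: the rate decays over epochs 0-4, restarts at epoch 5, decays again.
for epoch in range(10):
    print(f"epoch {epoch}: lr = {warm_restart_lr(epoch):.5f}")
```

In the training loop only the scheduler line changes; note that PyTorch's CosineAnnealingWarmRestarts is documented to also support fractional stepping within an epoch, i.e. calling scheduler.step(epoch + i / len(train_loader)) inside the batch loop.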