
Why learning rate strategy affects convergence in PyTorch

Introduction

The learning rate controls how much the model's weights change at each training step. A well-chosen learning rate strategy helps the model converge faster and more reliably, without stalling or oscillating around the solution.

Typical situations where the learning rate strategy matters:

Training a neural network to recognize images
Adjusting model training to avoid slow or unstable learning
Improving accuracy by fine-tuning how the model updates its weights
Preventing the model from overshooting the best solution during training
Syntax
PyTorch
optimizer = torch.optim.SGD(model.parameters(), lr=initial_lr)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=step, gamma=decay)

The optimizer updates model weights using the learning rate.

The scheduler changes the learning rate during training to help convergence.
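The two objects are used together inside a training loop: optimizer.step() updates the weights, then scheduler.step() updates the learning rate (recent PyTorch versions emit a warning if the order is reversed). A minimal sketch of that call order, where the model, data, and hyperparameter values are illustrative:

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
data = torch.randn(8, 1)
target = 2 * data
loss_fn = nn.MSELoss()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    optimizer.step()    # update weights first
    scheduler.step()    # then advance the learning rate schedule
```

With step_size=10, the learning rate is still 0.1 after these 3 epochs; the first halving happens at epoch 10.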

Examples
Halves the learning rate every 10 scheduler steps (typically one step per epoch).
PyTorch
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
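To see the resulting schedule, you can record the learning rate each epoch; the loop below is a sketch with a placeholder model and no real training data:

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

lrs = []
for epoch in range(30):
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.step()    # normally preceded by loss.backward()
    scheduler.step()

print(lrs[0], lrs[10], lrs[20])  # 0.1, 0.05, 0.025
```

The learning rate stays constant within each 10-epoch block and is halved at the boundaries.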
Multiplies the learning rate by 0.9 at every scheduler step, giving smooth exponential decay.
PyTorch
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
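ExponentialLR follows the closed form lr_t = lr_0 * gamma**t, so after N scheduler steps you can predict the learning rate directly. A quick check of this, again with a placeholder model:

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)

for _ in range(5):
    optimizer.step()
    scheduler.step()

# After 5 steps the learning rate is 0.01 * 0.9**5 (up to float rounding)
print(optimizer.param_groups[0]["lr"])
print(0.01 * 0.9 ** 5)
```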
Sample Model

This code trains a simple linear model to learn y = 2x. The learning rate starts at 0.1 and is halved every 5 epochs by StepLR. The printed output shows the loss decreasing while the learning rate drops, illustrating how the schedule affects training.

PyTorch
import torch
import torch.nn as nn
import torch.optim as optim

# Simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)
    def forward(self, x):
        return self.linear(x)

model = SimpleModel()

# Data: y = 2x
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = torch.tensor([[2.0], [4.0], [6.0], [8.0]])

# Optimizer and scheduler
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)

loss_fn = nn.MSELoss()

for epoch in range(15):
    optimizer.zero_grad()
    outputs = model(x)
    loss = loss_fn(outputs, y)
    loss.backward()
    optimizer.step()
    scheduler.step()
    print(f"Epoch {epoch+1}: Loss={loss.item():.4f}, LR={scheduler.get_last_lr()[0]:.4f}")
Important Notes

A learning rate that is too high can cause the loss to oscillate or diverge instead of settling.

A learning rate that is too low makes training unnecessarily slow.

Decaying the learning rate during training balances fast early progress with stable convergence at the end.
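These effects can be seen on the same y = 2x problem by fitting with different fixed learning rates; the specific values 0.5, 0.001, and 0.05 below are illustrative choices, not recommendations:

```python
import torch
import torch.nn as nn

x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = 2 * x

def final_loss(lr, epochs=20):
    """Train a fresh linear model with a fixed lr and return the last loss."""
    torch.manual_seed(0)  # same starting weights for a fair comparison
    model = nn.Linear(1, 1)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

print(final_loss(0.5))    # too high: the loss blows up instead of settling
print(final_loss(0.001))  # too low: the loss shrinks, but very slowly
print(final_loss(0.05))   # moderate: the loss drops toward zero
```

Starting near the moderate value and decaying toward smaller ones, as the schedulers above do, captures the best of both regimes.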

Summary

The learning rate controls the size of each weight update as the model learns.

A strategy that changes the learning rate over time helps the model reach better solutions faster.

PyTorch's schedulers in torch.optim.lr_scheduler make it easy to adjust the learning rate during training.