How to Set Learning Rate in PyTorch: Simple Guide
In PyTorch, you set the learning rate by passing the lr parameter when creating an optimizer such as torch.optim.SGD or torch.optim.Adam. For example, optimizer = torch.optim.SGD(model.parameters(), lr=0.01) sets a learning rate of 0.01.
Syntax
To set the learning rate in PyTorch, you specify the lr argument when creating an optimizer. The optimizer updates model weights during training.
- model.parameters(): Passes the model's parameters to optimize.
- lr: The learning rate, a small positive number controlling step size.
- optimizer type: Common choices include SGD, Adam, and RMSprop.
```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
```
Example
This example shows how to create a simple linear model and set the learning rate to 0.1 using the SGD optimizer. It runs one training step and prints the loss; the exact numbers you see depend on the model's random initial weights.
```python
import torch
import torch.nn as nn
import torch.optim as optim

# Simple linear model
model = nn.Linear(1, 1)

# Mean squared error loss
loss_fn = nn.MSELoss()

# Set learning rate to 0.1
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Dummy input and target
data = torch.tensor([[1.0]])
target = torch.tensor([[2.0]])

# Forward pass
output = model(data)
loss = loss_fn(output, target)
print(f'Initial loss: {loss.item():.4f}')

# Backward pass and optimize
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Check loss after one step
output = model(data)
loss = loss_fn(output, target)
print(f'Loss after one step: {loss.item():.4f}')
```
Output
Initial loss: 1.0000
Loss after one step: 0.8100
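If you want to confirm which learning rate the optimizer is actually using, you can read it back from its param_groups. This is a quick check against the optimizer created in the example above:

```python
# Each optimizer stores its hyperparameters per parameter group,
# with the learning rate under the 'lr' key.
for group in optimizer.param_groups:
    print(group['lr'])  # 0.1 for the example above
```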
Common Pitfalls
Common mistakes when setting the learning rate in PyTorch include:
- Setting the learning rate too high, causing training to diverge.
- Forgetting to pass model.parameters() to the optimizer.
- Changing the learning rate without updating the optimizer, or using learning rate schedulers incorrectly (a correct in-place change is sketched below).
- Not zeroing gradients before backpropagation with optimizer.zero_grad().
Always start with a small learning rate like 0.01 or 0.001 and adjust based on training behavior.
```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(1, 1)

# Wrong: forgetting model.parameters()
# optimizer = optim.SGD(lr=0.01)  # TypeError: the params argument is missing

# Right way:
optimizer = optim.SGD(model.parameters(), lr=0.01)
```
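If you do need to change the learning rate mid-training, avoid the pitfall above by updating the optimizer in place rather than just changing a Python variable. A minimal sketch, assuming the optimizer from the snippets above:

```python
# Update the 'lr' entry of every parameter group; the optimizer
# uses these values on its next step() call.
for group in optimizer.param_groups:
    group['lr'] = 0.001
```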
Quick Reference
Here is a quick cheat sheet for setting learning rates in common PyTorch optimizers:
| Optimizer | Typical starting lr |
|---|---|
| SGD | lr=0.01 |
| Adam | lr=0.001 |
| RMSprop | lr=0.01 |
| Adagrad | lr=0.01 |
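As a concrete reading of the table, the lines below create each optimizer with its listed value. The values are common starting points rather than requirements, and model stands for any nn.Module:

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(1, 1)  # placeholder model; any nn.Module works

sgd = optim.SGD(model.parameters(), lr=0.01)
adam = optim.Adam(model.parameters(), lr=0.001)
rmsprop = optim.RMSprop(model.parameters(), lr=0.01)
adagrad = optim.Adagrad(model.parameters(), lr=0.01)
```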
Key Takeaways
- Set the learning rate by passing the lr parameter when creating the optimizer.
- Always pass model.parameters() to the optimizer so it can update the model's weights.
- Start with a small learning rate such as 0.01 or 0.001 to avoid training issues.
- Remember to zero gradients before backpropagation with optimizer.zero_grad().
- Use a learning rate scheduler to adjust the learning rate during training if needed (see the sketch below).
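For the last point, PyTorch ships schedulers in torch.optim.lr_scheduler. Below is a minimal sketch using StepLR, which multiplies the learning rate by gamma every step_size epochs; the model, optimizer, and epoch counts are illustrative:

```python
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

model = nn.Linear(1, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Halve the learning rate every 10 epochs.
scheduler = StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(30):
    # ... training for one epoch goes here (forward, backward, optimizer.step()) ...
    scheduler.step()  # advance the schedule once per epoch

# The learning rate is now 0.1 * 0.5**3 = 0.0125.
print(optimizer.param_groups[0]['lr'])
```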