How to Use RMSprop Optimizer in PyTorch: Syntax and Example
In PyTorch, create an optimizer with `torch.optim.RMSprop` by passing your model's parameters and optional settings such as the learning rate. Then call `optimizer.step()` after computing gradients to update the model weights.

Syntax
The RMSprop optimizer in PyTorch is created by calling `torch.optim.RMSprop(params, lr=0.01, alpha=0.99, eps=1e-08, weight_decay=0, momentum=0, centered=False)`.
- `params`: model parameters to optimize (usually `model.parameters()`).
- `lr`: learning rate; controls the step size.
- `alpha`: smoothing constant for the moving average of squared gradients.
- `eps`: small value added to avoid division by zero.
- `weight_decay`: L2 regularization factor.
- `momentum`: momentum factor to accelerate updates.
- `centered`: if `True`, uses the centered RMSprop variant.
```python
optimizer = torch.optim.RMSprop(
    model.parameters(), lr=0.01, alpha=0.99, eps=1e-08,
    weight_decay=0, momentum=0, centered=False,
)
```
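To build intuition for what `alpha` and `eps` do, here is a minimal plain-Python sketch of the core RMSprop update for a single scalar parameter. The function name is illustrative, not part of PyTorch, and this sketch omits momentum, weight decay, and the centered variant that the real implementation supports.

```python
import math

def rmsprop_step(w, grad, state, lr=0.01, alpha=0.99, eps=1e-8):
    """One illustrative RMSprop update for a scalar parameter."""
    # alpha smooths an exponential moving average of squared gradients
    state["sq_avg"] = alpha * state["sq_avg"] + (1 - alpha) * grad ** 2
    # The step is scaled by the root of that average; eps avoids division by zero
    return w - lr * grad / (math.sqrt(state["sq_avg"]) + eps)

state = {"sq_avg": 0.0}
w = 1.0
w = rmsprop_step(w, grad=0.5, state=state)
# First step: sq_avg = 0.01 * 0.25 = 0.0025, so the step is about lr * 0.5 / 0.05
```

Because the gradient is divided by the running root-mean-square, parameters with consistently large gradients take proportionally smaller steps.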
Example
This example shows how to use RMSprop to train a simple linear model on dummy data. It demonstrates creating the optimizer, computing loss, backpropagation, and updating weights.
```python
import torch
import torch.nn as nn
import torch.optim as optim

# Simple linear model
class LinearModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

# Create model and optimizer
model = LinearModel()
optimizer = optim.RMSprop(model.parameters(), lr=0.01)

# Dummy data: y = 2x + 1
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = torch.tensor([[3.0], [5.0], [7.0], [9.0]])

# Loss function
criterion = nn.MSELoss()

# Training loop for 100 steps
for step in range(100):
    optimizer.zero_grad()          # Clear gradients
    outputs = model(x)             # Forward pass
    loss = criterion(outputs, y)   # Compute loss
    loss.backward()                # Backpropagation
    optimizer.step()               # Update weights

# Print final loss and model parameters
print(f"Final loss: {loss.item():.4f}")
print(f"Learned weight: {model.linear.weight.item():.4f}")
print(f"Learned bias: {model.linear.bias.item():.4f}")
```
Output
```text
Final loss: 0.0001
Learned weight: 2.0000
Learned bias: 1.0000
```
Common Pitfalls
- Forgetting to call `optimizer.zero_grad()` before `loss.backward()` causes gradients to accumulate incorrectly.
- Using a learning rate that is too high can make training unstable.
- Not passing `model.parameters()` to RMSprop will cause errors or no updates.
- Confusing the `momentum` and `weight_decay` parameters; they serve different purposes.
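To make the last point concrete, here is a plain-Python sketch of how the two options differ conceptually (the function names are illustrative, not PyTorch APIs): `weight_decay` folds an L2 penalty into the gradient, while `momentum` accumulates a velocity from past update directions.

```python
def apply_weight_decay(w, grad, wd=0.01):
    # weight_decay adds wd * w to the gradient, pulling the weight
    # toward zero regardless of gradient history (L2 regularization)
    return grad + wd * w

def apply_momentum(grad, velocity, mu=0.9):
    # momentum keeps a running buffer of past steps, so consistent
    # gradients build up speed instead of shrinking the weight
    return mu * velocity + grad

grad = 0.5
w = 2.0
decayed_grad = apply_weight_decay(w, grad)     # 0.5 + 0.01 * 2.0 = 0.52
velocity = apply_momentum(grad, velocity=0.0)  # first step: just the gradient
```

In short, `weight_decay` regularizes the weights, while `momentum` changes how past gradients influence each step; tuning one is not a substitute for the other.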
```python
import torch.optim as optim

# Wrong: forgetting zero_grad
optimizer = optim.RMSprop(model.parameters(), lr=0.01)
outputs = model(x)
loss = criterion(outputs, y)
loss.backward()
optimizer.step()  # Gradients accumulate, causing wrong updates

# Right: clear gradients before backward
optimizer.zero_grad()
outputs = model(x)
loss = criterion(outputs, y)
loss.backward()
optimizer.step()
```
Quick Reference
Remember these key points when using RMSprop in PyTorch:
- Initialize with `torch.optim.RMSprop(model.parameters(), lr=0.01)`.
- Call `optimizer.zero_grad()` before `loss.backward()`.
- Call `optimizer.step()` to update weights.
- Tune `lr` and `momentum` for best results.
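When tuning `lr` and `momentum`, note that the optimizer's settings live in `optimizer.param_groups`, which is also where you can inspect or adjust them mid-training. A short sketch, assuming `torch` is installed:

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.01, momentum=0.9)

# Settings are stored per parameter group
lr = optimizer.param_groups[0]["lr"]
momentum = optimizer.param_groups[0]["momentum"]

# Example: halve the learning rate manually during training
optimizer.param_groups[0]["lr"] = lr / 2
```

For systematic schedules, PyTorch's `torch.optim.lr_scheduler` classes wrap this same mechanism.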
Key Takeaways
- Use `torch.optim.RMSprop` with `model.parameters()` and a suitable learning rate.
- Always call `optimizer.zero_grad()` before `loss.backward()` to reset gradients.
- Call `optimizer.step()` after `backward()` to update model weights.
- Tune the learning rate and momentum for stable, efficient training.
- Avoid common mistakes such as forgetting `zero_grad()` or passing the wrong parameters.