How to Use RMSprop Optimizer in PyTorch: Syntax and Example
In PyTorch, create an optimizer with `torch.optim.RMSprop` by passing your model's parameters and optional settings such as the learning rate. Then call `optimizer.step()` after computing gradients to update the model weights.

Syntax
The RMSprop optimizer in PyTorch is created by calling `torch.optim.RMSprop(params, lr=0.01, alpha=0.99, eps=1e-08, weight_decay=0, momentum=0, centered=False)`.
- `params`: model parameters to optimize (usually `model.parameters()`).
- `lr`: learning rate; controls the step size.
- `alpha`: smoothing constant for the moving average of squared gradients.
- `eps`: small value added to avoid division by zero.
- `weight_decay`: L2 regularization factor.
- `momentum`: momentum factor to accelerate updates.
- `centered`: if `True`, uses the centered RMSprop variant.
```python
optimizer = torch.optim.RMSprop(
    model.parameters(), lr=0.01, alpha=0.99, eps=1e-08,
    weight_decay=0, momentum=0, centered=False,
)
```
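To build intuition for what `alpha` and `eps` do, here is a minimal plain-Python sketch of the core RMSprop update for a single scalar parameter. The function name is illustrative, not part of PyTorch, and this sketch omits momentum, weight decay, and the centered variant that the real implementation supports.

```python
import math

def rmsprop_step(w, grad, state, lr=0.01, alpha=0.99, eps=1e-8):
    """One illustrative RMSprop update for a scalar parameter."""
    # alpha smooths an exponential moving average of squared gradients
    state["sq_avg"] = alpha * state["sq_avg"] + (1 - alpha) * grad ** 2
    # The step is scaled by the root of that average; eps avoids division by zero
    return w - lr * grad / (math.sqrt(state["sq_avg"]) + eps)

state = {"sq_avg": 0.0}
w = 1.0
w = rmsprop_step(w, grad=0.5, state=state)
# First step: sq_avg = 0.01 * 0.25 = 0.0025, so the step is about lr * 0.5 / 0.05
```

Because the gradient is divided by the running root-mean-square, parameters with consistently large gradients take proportionally smaller steps.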
Example
This example shows how to use RMSprop to train a simple linear model on dummy data. It demonstrates creating the optimizer, computing loss, backpropagation, and updating weights.
```python
import torch
import torch.nn as nn
import torch.optim as optim

# Simple linear model
class LinearModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

# Create model and optimizer
model = LinearModel()
optimizer = optim.RMSprop(model.parameters(), lr=0.01)

# Dummy data: y = 2x + 1
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = torch.tensor([[3.0], [5.0], [7.0], [9.0]])

# Loss function
criterion = nn.MSELoss()

# Training loop for 100 steps
for step in range(100):
    optimizer.zero_grad()          # Clear gradients
    outputs = model(x)             # Forward pass
    loss = criterion(outputs, y)   # Compute loss
    loss.backward()                # Backpropagation
    optimizer.step()               # Update weights

# Print final loss and model parameters
print(f"Final loss: {loss.item():.4f}")
print(f"Learned weight: {model.linear.weight.item():.4f}")
print(f"Learned bias: {model.linear.bias.item():.4f}")
```
Output
```text
Final loss: 0.0001
Learned weight: 2.0000
Learned bias: 1.0000
```
Common Pitfalls
- Forgetting to call `optimizer.zero_grad()` before `loss.backward()` causes gradients to accumulate incorrectly.
- Using a learning rate that is too high can make training unstable.
- Not passing `model.parameters()` to RMSprop will cause errors or no updates.
- Confusing the `momentum` and `weight_decay` parameters; they serve different purposes.
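To make the last point concrete, here is a plain-Python sketch of how the two options differ conceptually (the function names are illustrative, not PyTorch APIs): `weight_decay` folds an L2 penalty into the gradient, while `momentum` accumulates a velocity from past update directions.

```python
def apply_weight_decay(w, grad, wd=0.01):
    # weight_decay adds wd * w to the gradient, pulling the weight
    # toward zero regardless of gradient history (L2 regularization)
    return grad + wd * w

def apply_momentum(grad, velocity, mu=0.9):
    # momentum keeps a running buffer of past steps, so consistent
    # gradients build up speed instead of shrinking the weight
    return mu * velocity + grad

grad = 0.5
w = 2.0
decayed_grad = apply_weight_decay(w, grad)     # 0.5 + 0.01 * 2.0 = 0.52
velocity = apply_momentum(grad, velocity=0.0)  # first step: just the gradient
```

In short, `weight_decay` regularizes the weights, while `momentum` changes how past gradients influence each step; tuning one is not a substitute for the other.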
```python
import torch.optim as optim

# Wrong: forgetting zero_grad
optimizer = optim.RMSprop(model.parameters(), lr=0.01)
outputs = model(x)
loss = criterion(outputs, y)
loss.backward()
optimizer.step()  # Gradients accumulate, causing wrong updates

# Right: clear gradients before backward
optimizer.zero_grad()
outputs = model(x)
loss = criterion(outputs, y)
loss.backward()
optimizer.step()
```
Quick Reference
Remember these key points when using RMSprop in PyTorch:
- Initialize with `torch.optim.RMSprop(model.parameters(), lr=0.01)`.
- Call `optimizer.zero_grad()` before `loss.backward()`.
- Call `optimizer.step()` to update weights.
- Tune `lr` and `momentum` for best results.
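When tuning `lr` and `momentum`, note that the optimizer's settings live in `optimizer.param_groups`, which is also where you can inspect or adjust them mid-training. A short sketch, assuming `torch` is installed:

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.01, momentum=0.9)

# Settings are stored per parameter group
lr = optimizer.param_groups[0]["lr"]
momentum = optimizer.param_groups[0]["momentum"]

# Example: halve the learning rate manually during training
optimizer.param_groups[0]["lr"] = lr / 2
```

For systematic schedules, PyTorch's `torch.optim.lr_scheduler` classes wrap this same mechanism.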
Key Takeaways
- Use `torch.optim.RMSprop` with `model.parameters()` and a suitable learning rate.
- Always call `optimizer.zero_grad()` before `loss.backward()` to reset gradients.
- Call `optimizer.step()` after `backward()` to update model weights.
- Tune the learning rate and momentum for stable, efficient training.
- Avoid common mistakes such as forgetting `zero_grad()` or passing the wrong parameters.