How to Train a Model in PyTorch: Simple Step-by-Step Guide
To train a model in PyTorch, define your model, loss function, and optimizer, then loop over your data to perform forward passes, compute the loss, backpropagate errors with `loss.backward()`, and update weights using `optimizer.step()`. Repeat this process for multiple epochs until the model learns.
Syntax
Training a model in PyTorch involves these key steps:
- Model: Your neural network class instance.
- Loss function: Measures how far predictions are from targets.
- Optimizer: Updates model weights to reduce loss.
- Training loop: For each batch, run a forward pass, compute the loss, backpropagate, and update the weights.
```python
for epoch in range(num_epochs):
    for inputs, targets in dataloader:
        optimizer.zero_grad()             # Clear old gradients
        outputs = model(inputs)           # Forward pass
        loss = loss_fn(outputs, targets)  # Compute loss
        loss.backward()                   # Backpropagation
        optimizer.step()                  # Update weights
```
Example
This example shows training a simple neural network on random data for 3 epochs. It demonstrates model definition, loss, optimizer, and the training loop with printed loss values.
```python
import torch
import torch.nn as nn
import torch.optim as optim

# Simple model with one linear layer
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

# Create model, loss function, optimizer
model = SimpleModel()
loss_fn = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Dummy data: 5 samples, 10 features each
inputs = torch.randn(5, 10)
targets = torch.randn(5, 1)

num_epochs = 3
for epoch in range(num_epochs):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")
```
Output
Epoch 1, Loss: 1.1234
Epoch 2, Loss: 1.0987
Epoch 3, Loss: 1.0745
Exact loss values will vary from run to run, since the inputs and targets are random.
Common Pitfalls
Common mistakes when training in PyTorch include:
- Not calling `optimizer.zero_grad()` before `loss.backward()`, causing gradients to accumulate across iterations.
- Forgetting to call `loss.backward()`, so gradients are never computed and weights never update.
- Using the wrong device (CPU vs GPU) inconsistently for model and data (see the device sketch after the code below).
- Not setting the model to `model.train()` mode during training, which affects layers like dropout or batch norm.
```python
# Wrong way (missing zero_grad)
for inputs, targets in dataloader:
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    loss.backward()        # Gradients accumulate here
    optimizer.step()

# Right way
for inputs, targets in dataloader:
    optimizer.zero_grad()  # Clear gradients
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    loss.backward()
    optimizer.step()
```
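To avoid the device pitfall, keep the model and every batch on the same device. Here is a minimal sketch, reusing `SimpleModel` and the imports from the example above, with an assumed `dataloader` that yields `(inputs, targets)` batches:

```python
# Pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = SimpleModel().to(device)  # Move model parameters to the device
loss_fn = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

for inputs, targets in dataloader:
    # Move each batch to the same device as the model
    inputs, targets = inputs.to(device), targets.to(device)
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
```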
Quick Reference
Remember these tips for smooth training:
- Always clear gradients with `optimizer.zero_grad()` before backpropagation.
- Use `model.train()` mode during training and `model.eval()` during evaluation (see the sketch after this list).
- Match the device (CPU/GPU) for model and data.
- Monitor loss to check if training is working.
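For the train/eval toggle, here is a minimal sketch, assuming the `model`, `loss_fn`, and `optimizer` from the example above plus hypothetical `train_loader` and `val_loader` iterables:

```python
# Training phase: dropout and batch norm behave in training mode
model.train()
for inputs, targets in train_loader:  # train_loader is an assumed name
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()

# Evaluation phase: fixed layer behavior, no gradient tracking
model.eval()
with torch.no_grad():
    for inputs, targets in val_loader:  # val_loader is an assumed name
        val_loss = loss_fn(model(inputs), targets)
        print(f"Validation loss: {val_loss.item():.4f}")
```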
Key Takeaways
- Define the model, loss function, and optimizer before training.
- Use a loop to perform the forward pass, compute loss, backpropagate, and update weights.
- Always call `optimizer.zero_grad()` before `loss.backward()` to reset gradients.
- Set `model.train()` mode during training to enable proper layer behavior.
- Keep model and data on the same device (CPU or GPU) to avoid errors.