PyTorch · ~5 mins

Why regularization controls overfitting in PyTorch

Introduction

Regularization discourages a model from memorizing its training data. By keeping the model simple, it helps the model perform well on new, unseen data.

Regularization is especially useful when:

Your model performs very well on training data but poorly on new data.
Your model is very complex, with many parameters.
You want to improve your model's ability to generalize.
Training data is limited or noisy.
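The first symptom above, a training loss far below the validation loss, is the classic sign of overfitting. As a minimal sketch (the helper name overfitting_gap is hypothetical, not a PyTorch function), you can quantify it as the relative gap between the two losses:

```python
def overfitting_gap(train_loss, val_loss):
    """Relative gap between validation and training loss.

    A large positive value suggests the model has memorized
    the training set instead of generalizing.
    """
    return (val_loss - train_loss) / train_loss

# Validation loss three times the training loss -> gap of 2.0
print(overfitting_gap(0.5, 1.5))  # 2.0
```

When this gap grows as training continues, adding regularization (or more data) is a reasonable next step.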
Syntax
PyTorch
loss = criterion(output, target) + lambda_ * regularization_term

The regularization term adds a penalty to the loss.

Common regularization terms are L1 (the sum of absolute weights, which encourages sparsity) and L2 (the sum of squared weights, which keeps weights small).

Examples
This adds L2 regularization to the loss to keep weights small.
PyTorch
l2_lambda = 0.01
l2_norm = sum(p.pow(2.0).sum() for p in model.parameters())
loss = criterion(output, target) + l2_lambda * l2_norm
This adds L1 regularization to encourage sparsity in weights.
PyTorch
l1_lambda = 0.005
l1_norm = sum(p.abs().sum() for p in model.parameters())
loss = criterion(output, target) + l1_lambda * l1_norm
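For L2 specifically, you usually don't need to build the penalty by hand: PyTorch optimizers accept a weight_decay argument that applies an L2-style penalty inside the update step. A minimal sketch (model and data here are placeholders):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
# weight_decay applies the L2 penalty inside the optimizer's update,
# so no manual penalty term is added to the loss below.
optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=0.01)

x = torch.randn(4, 2)
y = torch.randn(4, 1)

loss = nn.MSELoss()(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item() >= 0.0)  # True
```

Writing the penalty manually (as in the examples above) is still useful when you want L1, or want to regularize only some parameters.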
Sample Model

This code trains a small neural network on the XOR problem with L2 regularization to prevent overfitting. It prints the final loss and rounded predictions.

PyTorch
import torch
import torch.nn as nn
import torch.optim as optim

# Simple model with one hidden layer (XOR is not linearly separable,
# so a single linear layer cannot learn it)
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, 8),
            nn.ReLU(),
            nn.Linear(8, 1),
        )
    def forward(self, x):
        return self.net(x)

# Data: XOR problem
inputs = torch.tensor([[0,0],[0,1],[1,0],[1,1]], dtype=torch.float32)
targets = torch.tensor([[0],[1],[1],[0]], dtype=torch.float32)

model = SimpleNet()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)
l2_lambda = 0.001  # a small penalty; too large a value would underfit XOR

for epoch in range(2000):
    optimizer.zero_grad()
    outputs = model(inputs)
    mse_loss = criterion(outputs, targets)
    l2_norm = sum(p.pow(2).sum() for p in model.parameters())
    loss = mse_loss + l2_lambda * l2_norm
    loss.backward()
    optimizer.step()

# Print final loss and predictions
with torch.no_grad():
    preds = model(inputs)
    final_loss = criterion(preds, targets).item()
    print(f"Final MSE Loss: {final_loss:.4f}")
    print("Predictions:")
    print(preds.round())
Important Notes

Regularization adds a small penalty to large weights, encouraging simpler models.

Too much regularization can make the model too simple and underfit.

Common regularization methods include L1, L2, and dropout.
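Dropout regularizes differently: instead of penalizing weights, it randomly zeroes activations during training so the network cannot rely on any single unit. A minimal sketch using nn.Dropout (the layer sizes here are arbitrary):

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # each activation is zeroed with probability 0.5
    nn.Linear(8, 1),
)

x = torch.randn(2, 4)

net.train()   # dropout active during training
_ = net(x)

net.eval()    # dropout disabled at evaluation time
with torch.no_grad():
    out = net(x)
print(out.shape)  # torch.Size([2, 1])
```

Remember to call model.eval() before inference; otherwise dropout stays active and predictions become noisy.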

Summary

Regularization helps control overfitting by keeping model weights small.

It adds a penalty term to the loss function during training.

This leads to better performance on new, unseen data.