Model Pipeline - Why regularization controls overfitting
This pipeline shows how adding regularization helps a model avoid overfitting by keeping it simple and improving its ability to generalize to new data.
Figure: Loss (y-axis) plotted against Epochs (x-axis).

| Epoch | Loss ↓ | Accuracy ↑ | Observation |
|---|---|---|---|
| 1 | 1.2 | 40% | High loss and low accuracy, model just started learning |
| 5 | 0.6 | 70% | Loss decreased, accuracy improved, model learning well |
| 10 | 0.4 | 80% | Loss continues to decrease, accuracy rises |
| 15 | 0.35 | 83% | Loss stabilizes, accuracy improves slowly |
| 20 | 0.33 | 85% | Model converged with good generalization due to regularization |
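
To make the "keeping it simple" idea concrete, here is a minimal sketch of how an L2 penalty pulls a weight toward zero. The one-weight objective (w - 3)^2 + lam * w^2, the function name `fit_weight`, and the values of `lam`, `lr`, and `steps` are all illustrative assumptions, not part of the pipeline above.

```python
def fit_weight(lam, lr=0.1, steps=200):
    """Gradient descent on (w - 3)^2 + lam * w^2 for a single weight w."""
    w = 0.0
    for _ in range(steps):
        # Gradient of the data term plus gradient of the L2 penalty term.
        grad = 2 * (w - 3.0) + 2 * lam * w
        w -= lr * grad
    return w

# Hypothetical settings: no penalty versus a moderate penalty.
unregularized = fit_weight(lam=0.0)
regularized = fit_weight(lam=0.5)

print(f"lam=0.0 -> w = {unregularized:.4f}")
print(f"lam=0.5 -> w = {regularized:.4f}")
```

The minimizer of this toy objective is 3 / (1 + lam), so a larger `lam` gives a smaller weight. The same pressure, applied to every weight of a real model, is what discourages it from fitting noise in the training data.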
Use the weight_decay parameter in optimizers to apply L2 regularization. Passing a value such as weight_decay=0.1 to the optimizer is the correct way to add L2 regularization.

    import torch

    # model, dataloader, and loss_fn are assumed to be defined elsewhere
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.01)
    for data, target in dataloader:
        optimizer.zero_grad()
        output = model(data)
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()

What effect does weight_decay=0.01 have during training?

The weight_decay parameter adds L2 regularization, penalizing large weights during training.

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for data, target in dataloader:
        optimizer.zero_grad()
        output = model(data)
        loss = loss_fn(output, target) + 0.01 * torch.sum(model.parameters())
        loss.backward()
        optimizer.step()

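What is wrong with this code regarding regularization?

The penalty term is torch.sum(model.parameters()), which is incorrect for an L2 penalty: model.parameters() returns a generator rather than a tensor, and an L2 penalty sums the squared parameter values, not the raw values.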
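
One way to write the penalty by hand is sketched below. The helper names `l2_penalty` and `train_step`, the toy `Linear` model, and the random data are assumptions made only so the example runs on its own. The sketch also compares the manual penalty with the optimizer's built-in option: in `torch.optim.SGD`, `weight_decay` adds `weight_decay * p` to each gradient, so a penalty of `lam * sum(p^2)` corresponds to `weight_decay = 2 * lam`.

```python
import copy
import torch

def l2_penalty(model):
    """Sum of squared entries over all parameter tensors (weights and biases)."""
    return sum(p.pow(2).sum() for p in model.parameters())

def train_step(model, optimizer, loss_fn, data, target, lam=0.0):
    """Run one optimization step; lam scales the explicit L2 penalty."""
    optimizer.zero_grad()
    loss = loss_fn(model(data), target) + lam * l2_penalty(model)
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical toy data and model, used only for illustration.
torch.manual_seed(0)
data, target = torch.randn(8, 3), torch.randn(8, 1)
loss_fn = torch.nn.MSELoss()

manual = torch.nn.Linear(3, 1)
builtin = copy.deepcopy(manual)  # identical starting weights

lam = 0.01
opt_manual = torch.optim.SGD(manual.parameters(), lr=0.01)
opt_builtin = torch.optim.SGD(builtin.parameters(), lr=0.01, weight_decay=2 * lam)

train_step(manual, opt_manual, loss_fn, data, target, lam=lam)  # penalty in the loss
train_step(builtin, opt_builtin, loss_fn, data, target)         # penalty in the optimizer

# Compare the parameters of the two models after one step.
same = all(
    torch.allclose(a, b, atol=1e-6)
    for a, b in zip(manual.parameters(), builtin.parameters())
)
print("updates match:", same)
```

After one step the two models should hold the same parameters up to floating-point error. The explicit form is convenient when only some parameters should be penalized, while `weight_decay` keeps the training loop shorter.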