PyTorch · ~5 mins

Weight decay (L2 regularization) in PyTorch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is weight decay (L2 regularization) in machine learning?
Weight decay, also called L2 regularization, is a technique that adds a penalty to large weights in a model to keep them small. This helps the model avoid overfitting by making it simpler and more general.
intermediate
How does weight decay affect the loss function during training?
Weight decay adds a term to the loss function that is proportional to the sum of the squares of the weights. This extra term encourages the model to keep weights small while still fitting the data.
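The penalty term described above can be sketched directly in code. This is a minimal illustration of adding the L2 term to the loss by hand; the model, data, and the strength value lam are illustrative, not from the original text.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)          # tiny illustrative model
x = torch.randn(8, 4)            # dummy inputs
y = torch.randn(8, 1)            # dummy targets

lam = 0.001                      # regularization strength (lambda)
mse = nn.MSELoss()

# L2 term: sum of squared weights over all parameters
l2_penalty = sum((p ** 2).sum() for p in model.parameters())
loss = mse(model(x), y) + lam * l2_penalty
loss.backward()                  # gradients now include the 2*lambda*w term
```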
beginner
Show a simple PyTorch example of applying weight decay in an optimizer.
In PyTorch, you can add weight decay by setting the 'weight_decay' parameter in the optimizer. For example: optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)
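Expanding the one-liner above into a runnable training sketch (the model, data, and step count are illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)
x = torch.randn(8, 4)
y = torch.randn(8, 1)

# weight_decay is the L2 strength; no change to the loss is needed
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)
criterion = nn.MSELoss()

for _ in range(10):              # a few training steps
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()             # decay is applied inside this update
```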
intermediate
Why is weight decay preferred over manually adding L2 penalty to the loss in PyTorch?
Using the 'weight_decay' parameter in PyTorch optimizers is more efficient because the optimizer adds the decay term (weight_decay * w) directly to each parameter's gradient during the update step, so no extra penalty term has to be built into the loss and differentiated through the autograd graph. For plain SGD this is mathematically equivalent to adding an L2 penalty to the loss.
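A quick sketch of that equivalence for plain SGD: the same parameter updated once via the optimizer's weight_decay, and once by folding the L2 gradient (wd * w) in by hand. The dummy loss and hyperparameters are illustrative.

```python
import torch

torch.manual_seed(0)
w1 = torch.randn(3, requires_grad=True)
w2 = w1.detach().clone().requires_grad_(True)
lr, wd = 0.1, 0.01

# Path A: the optimizer handles the decay term
opt = torch.optim.SGD([w1], lr=lr, weight_decay=wd)
(w1 ** 2).sum().backward()       # dummy loss: grad = 2*w
opt.step()

# Path B: add the L2 gradient manually, then a plain SGD step
(w2 ** 2).sum().backward()
with torch.no_grad():
    w2 -= lr * (w2.grad + wd * w2)

print(torch.allclose(w1, w2))    # the two updates coincide
```

Note this equivalence holds for vanilla SGD; adaptive optimizers rescale the combined gradient, which is why their behavior with weight_decay can differ from a hand-added L2 loss.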
beginner
What happens if the weight decay value is set too high?
If weight decay is too high, the model weights become very small, which can cause underfitting. The model may not learn enough from the data and perform poorly on both training and test sets.
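The shrinking effect can be seen by training the same tiny model twice, once without decay and once with a deliberately large value (all names and numbers here are illustrative):

```python
import torch
import torch.nn as nn

def train(weight_decay, steps=100):
    torch.manual_seed(0)         # same init and data for both runs
    model = nn.Linear(4, 1)
    x = torch.randn(8, 4)
    y = torch.randn(8, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.05,
                          weight_decay=weight_decay)
    crit = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        crit(model(x), y).backward()
        opt.step()
    return model.weight.norm().item()

# The heavily decayed run ends with a much smaller weight norm,
# at the cost of fitting the data less closely (underfitting).
print(train(0.0), train(1.0))
```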
What does weight decay do to model weights during training?
A. Removes weights completely
B. Encourages weights to be smaller
C. Makes weights larger
D. Keeps weights unchanged
In PyTorch, how do you apply weight decay when creating an optimizer?
A. Normalize weights after each epoch
B. Add L2 penalty manually to the loss
C. Use a special weight_decay layer
D. Set the 'weight_decay' parameter in the optimizer
What is the main goal of using weight decay in training?
A. Increase model complexity
B. Speed up training time
C. Prevent overfitting by keeping weights small
D. Make the model memorize training data
What could happen if weight decay is set too high?
A. Model underfits and performs poorly
B. Model overfits the training data
C. Training speed increases drastically
D. Weights become very large
Which of these is NOT a benefit of using weight decay?
A. Guarantees perfect accuracy
B. Reduces overfitting
C. Keeps model weights small
D. Improves model generalization
Explain in your own words what weight decay (L2 regularization) is and why it is useful in training machine learning models.
Think about how adding a small cost to big weights helps the model not memorize training data.
Describe how to apply weight decay in PyTorch and why it is better to use the optimizer's weight_decay parameter instead of manually adding L2 loss.
Remember the PyTorch optimizer options and how they handle regularization internally.