Recall & Review

beginner

What is weight decay (L2 regularization) in machine learning?

Weight decay, also called L2 regularization, is a technique that penalizes large weights in a model so that they stay small. Smaller weights make the model simpler and less prone to overfitting, so it generalizes better to new data.

intermediate

How does weight decay affect the loss function during training?

Weight decay adds a term to the loss function that is proportional to the sum of the squares of the weights. This extra term encourages the model to keep weights small while still fitting the data.
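As a rough illustration of that extra term, the sketch below computes a penalized loss in plain Python. The function name `l2_penalized_loss`, the coefficient name `lam`, and the 1/2 factor are assumptions made for this example; the factor is a common convention, not something the card specifies.

```python
def l2_penalized_loss(data_loss, weights, lam):
    """Return data_loss plus (lam / 2) * sum of squared weights."""
    penalty = 0.5 * lam * sum(w * w for w in weights)
    return data_loss + penalty


# Same data loss, same coefficient, different weight magnitudes.
small = l2_penalized_loss(1.0, [0.1, -0.2], lam=0.01)
large = l2_penalized_loss(1.0, [10.0, -20.0], lam=0.01)

# Larger weights contribute a larger penalty to the total loss.
print(small, large)
```

Because the penalty grows with the square of each weight, the optimizer can lower the total loss either by fitting the data better or by shrinking the weights.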

beginner

Show a simple PyTorch example of applying weight decay in an optimizer.

In PyTorch, you can add weight decay by setting the 'weight_decay' parameter in the optimizer. For example: optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)
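To put that one-liner in context, here is a minimal sketch of a single training step. The tiny `nn.Linear` model, the random data, and the MSE loss are placeholders assumed for illustration; only the optimizer line comes from the card.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy model and data, used only to make the example runnable.
model = nn.Linear(4, 1)
x = torch.randn(8, 4)
y = torch.randn(8, 1)

# Weight decay is configured once, when the optimizer is created.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)

# One ordinary training step; no penalty term appears in the loss itself.
loss = nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Note that the reported `loss` value does not include the penalty, since the decay is applied inside `optimizer.step()`.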

intermediate

Why is weight decay preferred over manually adding L2 penalty to the loss in PyTorch?

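Using the 'weight_decay' parameter in PyTorch optimizers is more convenient and slightly cheaper: the optimizer adds weight_decay times each weight to that weight's gradient during the update step, so the penalty never has to be computed in the loss function or tracked by autograd. For plain SGD the result is the same as adding (weight_decay / 2) times the sum of squared weights to the loss. Adaptive optimizers such as Adam rescale the gradient, so a penalty folded into the gradient is rescaled too; torch.optim.AdamW applies the decay directly to the weights instead.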
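The sketch below checks, for plain SGD only, that the optimizer's 'weight_decay' and a manual L2 term produce the same update after one step. The toy model, data, and hyperparameter values are assumptions chosen for illustration, and the manual penalty uses a 1/2 factor so that its gradient is `wd * w`.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)
lr, wd = 0.1, 0.01
x, y = torch.randn(16, 3), torch.randn(16, 1)

# Two identical copies of a hypothetical toy model.
model_a = nn.Linear(3, 1)
model_b = copy.deepcopy(model_a)

# Variant A: let the optimizer apply weight decay.
opt_a = torch.optim.SGD(model_a.parameters(), lr=lr, weight_decay=wd)
loss_a = nn.functional.mse_loss(model_a(x), y)
opt_a.zero_grad()
loss_a.backward()
opt_a.step()

# Variant B: add (wd / 2) * sum of squared parameters to the loss by hand.
opt_b = torch.optim.SGD(model_b.parameters(), lr=lr)
l2 = sum((p ** 2).sum() for p in model_b.parameters())
loss_b = nn.functional.mse_loss(model_b(x), y) + 0.5 * wd * l2
opt_b.zero_grad()
loss_b.backward()
opt_b.step()

# Compare the updated parameters of both variants.
same = all(
    torch.allclose(pa, pb, atol=1e-6)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(same)
```

Both variants here decay the bias as well as the weight matrix, because `model.parameters()` includes every parameter.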

beginner

What happens if the weight decay value is set too high?

If weight decay is too high, the penalty dominates the loss and pushes the weights toward zero, which can cause underfitting. The model cannot fit the data well and performs poorly on both training and test sets.
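To get a feel for the scale involved, the sketch below isolates the decay term of an SGD update and ignores the data gradient entirely, which is a simplifying assumption. The helper name `decay_only_factor` and the chosen values are hypothetical.

```python
def decay_only_factor(lr, weight_decay, steps):
    """Fraction of a weight left after `steps` SGD updates driven only by weight decay.

    With a zero data gradient, each update is w <- w - lr * weight_decay * w.
    """
    w = 1.0
    for _ in range(steps):
        w -= lr * weight_decay * w
    return w


# A mild setting barely changes the weight over 100 steps.
mild = decay_only_factor(lr=0.1, weight_decay=1e-4, steps=100)

# A very large setting drives the weight almost to zero.
harsh = decay_only_factor(lr=0.1, weight_decay=1.0, steps=100)

print(mild, harsh)
```

In real training the data gradient pushes back against this shrinkage, but when the decay is too strong the shrinkage wins and the model underfits.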

What does weight decay do to model weights during training?

A. Removes weights completely

B. Encourages weights to be smaller

C. Makes weights larger

D. Keeps weights unchanged

In PyTorch, how do you apply weight decay when creating an optimizer?

A. Normalize weights after each epoch

B. Add L2 penalty manually to the loss

C. Use a special weight_decay layer

D. Set the 'weight_decay' parameter in the optimizer

What is the main goal of using weight decay in training?

A. Increase model complexity

B. Speed up training time

C. Prevent overfitting by keeping weights small

D. Make the model memorize training data

What could happen if weight decay is set too high?

A. Model underfits and performs poorly

B. Model overfits the training data

C. Training speed increases drastically

D. Weights become very large

Which of these is NOT a benefit of using weight decay?

A. Guarantees perfect accuracy

B. Reduces overfitting

C. Keeps model weights small

D. Improves model generalization

Explain in your own words what weight decay (L2 regularization) is and why it is useful in training machine learning models.

Describe how to apply weight decay in PyTorch and why it is better to use the optimizer's weight_decay parameter instead of manually adding L2 loss.

Practice

1. What is the main purpose of weight decay (L2 regularization) in training a PyTorch model?

easy

A. To reduce overfitting by penalizing large weights

B. To increase the learning rate automatically

C. To add more layers to the model

D. To speed up the training process

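Solution

Step 1: Understand weight decay concept

Weight decay adds a penalty on large weights, which pushes the model toward smaller weights.

Step 2: Connect to overfitting reduction

Smaller weights make the model less able to fit noise in the training data, which reduces overfitting.

Final Answer:

To reduce overfitting by penalizing large weights -> Option A

Quick Check:

Weight decay = penalty on large weights = less overfitting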
