Weight decay is often used during training of neural networks. What does it mainly help with?
Think about what happens when weights become very large and how that affects generalization.
Weight decay adds a penalty to large weights, encouraging the model to keep weights small. This reduces overfitting and helps the model generalize better.
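In plain SGD, the L2 penalty is folded directly into the gradient, so each step applies w ← w − lr·(grad + wd·w). A minimal sketch isolating that decay term (using a zero data loss so only the penalty acts):

```python
import torch

# Plain SGD folds weight decay into the update: w <- w - lr * (grad + wd * w)
w = torch.tensor([2.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1, weight_decay=0.5)

loss = (w ** 2).sum() * 0.0  # zero data loss: only the decay term acts
loss.backward()
opt.step()

# w <- 2.0 - 0.1 * (0 + 0.5 * 2.0) = 1.9
print(w.item())
```

With no data gradient, each step shrinks the weight by a factor of (1 − lr·wd), which is exactly the "keep weights small" pressure described above.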
Consider this PyTorch code snippet training a simple linear model with weight decay. What will be printed?
```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(1, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=0.1)
x = torch.tensor([[1.0]])
y = torch.tensor([[2.0]])
criterion = nn.MSELoss()

for _ in range(1):
    optimizer.zero_grad()
    output = model(x)
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()

print(round(loss.item(), 3))
```
Think about the initial random weights and one step of gradient descent with weight decay.
The printed loss is computed in the forward pass, before optimizer.step() runs, so it reflects the initial random parameters: the squared difference between the initial output and the target 2.0. Because nn.Linear is randomly initialized, the printed value varies from run to run. Weight decay has no effect on it at all, since decay only alters the parameter update, not the already-computed loss.
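To make the printed value deterministic, fix the initialization. A sketch assuming (hypothetically) weight 0.5 and bias 0.0, which makes the pre-step loss exactly (0.5 − 2.0)² = 2.25:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Same setup, but with a fixed initialization so the output is deterministic.
model = nn.Linear(1, 1)
with torch.no_grad():
    model.weight.fill_(0.5)
    model.bias.fill_(0.0)

optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=0.1)
x = torch.tensor([[1.0]])
y = torch.tensor([[2.0]])
criterion = nn.MSELoss()

optimizer.zero_grad()
loss = criterion(model(x), y)  # output = 0.5, so loss = (0.5 - 2.0)^2 = 2.25
loss.backward()
optimizer.step()                # updates parameters, but loss is already computed
print(round(loss.item(), 3))   # 2.25
```

The step changes the parameters, but the printed number is the loss from before the update.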
You want to apply weight decay only to the weights of a neural network, not to the bias terms. Which optimizer setup below does this correctly?
Think about how to separate parameters by name and assign different weight decay values.
Option A is correct: it splits the parameters into two optimizer groups, assigning weight_decay=0.0 to the bias parameters and a nonzero weight_decay to the remaining (weight) parameters.
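A sketch of that parameter-group setup (the model architecture here is an arbitrary example; the pattern is to partition named_parameters() by name and pass per-group weight_decay values):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

# Partition parameters by name: biases get no decay, everything else gets 0.01.
decay, no_decay = [], []
for name, param in model.named_parameters():
    (no_decay if name.endswith("bias") else decay).append(param)

optimizer = optim.SGD(
    [
        {"params": decay, "weight_decay": 0.01},
        {"params": no_decay, "weight_decay": 0.0},
    ],
    lr=0.1,
)
print([g["weight_decay"] for g in optimizer.param_groups])  # [0.01, 0.0]
```

Per-group options override the optimizer-level defaults, so each group keeps its own decay setting.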
If you increase the weight decay value excessively, what is the most likely effect on the model's training?
Consider what happens if the penalty on weights is very strong.
Too much weight decay forces the weights toward zero, limiting the model's capacity to learn patterns in the data and causing underfitting: both training and validation error stay high.
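The shrinking effect can be seen directly: with SGD, each step multiplies the weight by (1 − lr·wd) on top of the data update. A sketch with deliberately extreme values (lr=0.1, wd=5.0, so the factor is 0.5) and a zero data gradient to isolate the decay:

```python
import torch

# With lr=0.1 and weight_decay=5.0, the per-step shrink factor is
# 1 - 0.1 * 5.0 = 0.5: the weight halves every step, whatever the data says.
w = torch.tensor([1.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1, weight_decay=5.0)

for _ in range(3):
    opt.zero_grad()
    (0.0 * w).sum().backward()  # zero data gradient, to isolate the decay
    opt.step()

print(w.item())  # 1.0 * 0.5**3 = 0.125
```

When the decay term overwhelms the loss gradient like this, the weights collapse toward zero faster than the data can push them anywhere useful.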
Look at this PyTorch code snippet. The user expects weight decay to be applied, but the model's weights do not shrink. What is the bug?
```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=0.01)
x = torch.tensor([[1.0, 2.0]])
y = torch.tensor([[1.0]])
criterion = nn.MSELoss()

for _ in range(10):
    optimizer.zero_grad()
    output = model(x)
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()

print(model.weight)
```
Compare the size of the decay term (lr × weight_decay per step) with the size of the loss gradient, and consider which parameters the optimizer's weight_decay actually touches.
There is no configuration bug: PyTorch's built-in weight_decay applies to every parameter passed to the optimizer, biases included. The weights do not visibly shrink because the decay term is tiny (lr × weight_decay = 0.001 per step) while the data-fitting gradient dominates over only 10 steps. The weights settle at a trade-off between fitting y and staying small, not at zero; to see the decay in isolation, remove the data gradient or train for far longer.
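A sketch verifying that the optimizer's weight_decay touches the bias as well as the weight (fixed initialization and a zero data gradient, so only the decay term moves the parameters):

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Passing model.parameters() with weight_decay decays *every* parameter in
# the group, bias included. With a zero data gradient, both weight and bias
# shrink by the same factor (1 - lr * wd) = 0.999 per step.
model = nn.Linear(2, 1)
with torch.no_grad():
    model.weight.fill_(1.0)
    model.bias.fill_(1.0)

opt = optim.SGD(model.parameters(), lr=0.1, weight_decay=0.01)
opt.zero_grad()
(0.0 * model(torch.zeros(1, 2))).sum().backward()  # zero data gradient
opt.step()

print(model.weight.detach().flatten().tolist(), model.bias.item())
```

To exempt biases from decay, use the two-parameter-group pattern from the earlier question instead of passing model.parameters() directly.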