
Warmup strategies in PyTorch - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Why use a learning rate warmup in training?
Which of the following best explains the main reason for using a learning rate warmup at the start of training a neural network?
A. To prevent the model from diverging by starting with a very high learning rate
B. To reduce the total training time by skipping early epochs
C. To gradually increase the learning rate to avoid large updates that can destabilize early training
D. To immediately reach the maximum learning rate for faster convergence
💡 Hint
Think about how sudden large updates affect a new model's weights.
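
To make the hint concrete, here is a minimal sketch of how warmup shrinks the first few updates. It assumes plain SGD, where each weight moves by lr * grad; the helper name `warmup_lr` and the gradient magnitude are hypothetical values chosen only for illustration.

```python
def warmup_lr(step, base_lr=0.1, warmup_steps=5):
    """Linearly ramp the learning rate up to base_lr over warmup_steps."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


# Hypothetical gradient magnitude early in training (illustrative only).
grad_norm = 10.0

# Plain SGD moves a weight by lr * grad, so the update size scales with lr.
for step in range(6):
    with_warmup = warmup_lr(step) * grad_norm
    without_warmup = 0.1 * grad_norm
    print(f"step {step}: update with warmup = {with_warmup:.2f}, "
          f"without = {without_warmup:.2f}")
```

With these assumed numbers, the first update under warmup is one fifth the size of the update taken at the full learning rate, and the two match once warmup ends.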
Predict Output
intermediate
Output of learning rate scheduler with warmup
What will be the learning rate printed at epoch 3 in the following PyTorch code?
PyTorch
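import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import LambdaLR

# A single dummy parameter is enough to build an optimizer.
optimizer = SGD([torch.nn.Parameter(torch.randn(2, 2))], lr=0.1)
warmup_epochs = 5

def lr_lambda(epoch):
    # Multiplier applied to the base lr: ramps linearly to 1.0 during warmup.
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    return 1.0

scheduler = LambdaLR(optimizer, lr_lambda=lr_lambda)

for epoch in range(7):
    # Report the rate used during this epoch before advancing the schedule.
    print(f"Epoch {epoch}: lr = {optimizer.param_groups[0]['lr']:.2f}")
    optimizer.step()  # the training step goes here, before scheduler.step()
    scheduler.step()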
A. Epoch 3: lr = 0.06
B. Epoch 3: lr = 0.04
C. Epoch 3: lr = 0.1
D. Epoch 3: lr = 0.08
💡 Hint
Calculate (epoch + 1) / warmup_epochs * base_lr for epoch 3.
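
The formula in the hint can be tabulated without PyTorch. The sketch below only evaluates the multiplier for each epoch index. In PyTorch, LambdaLR multiplies the base rate by lr_lambda evaluated at its internal epoch counter, which starts at 0 and advances by one on every scheduler.step() call, so the value read from the optimizer depends on whether it is read before or after that call.

```python
base_lr = 0.1
warmup_epochs = 5


def lr_lambda(epoch):
    """Warmup multiplier: linear ramp to 1.0, then constant."""
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    return 1.0


# Learning rate associated with each epoch index under this formula.
schedule = [base_lr * lr_lambda(epoch) for epoch in range(7)]

for epoch, lr in enumerate(schedule):
    print(f"epoch index {epoch}: lr = {lr:.2f}")
```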
Model Choice
advanced
Choosing warmup strategy for transformer training
You are training a transformer model on a large text dataset. Which warmup strategy is most suitable to stabilize training and improve final accuracy?
A. Linear warmup followed by cosine decay
B. Step warmup with abrupt jumps in learning rate
C. Exponential warmup with sudden drop after warmup
D. No warmup, start with a fixed learning rate
💡 Hint
Consider smooth transitions in learning rate for stable training.
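
As an illustration of a smooth schedule, here is a minimal sketch of a linear warmup followed by cosine decay, written as a multiplier on the base learning rate. The function name `warmup_cosine` and the default step counts are assumptions for this example, not values taken from the problem.

```python
import math


def warmup_cosine(step, warmup_steps=1000, total_steps=10000, min_factor=0.0):
    """Multiplier on the base lr: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Linear ramp that reaches 1.0 on the last warmup step.
        return (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Cosine curve falling from 1.0 to 0.0 as progress goes from 0 to 1.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_factor + (1.0 - min_factor) * cosine


for step in (0, 499, 999, 1000, 5500, 10000):
    print(f"step {step}: factor = {warmup_cosine(step):.4f}")
```

A function like this can be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`, with `scheduler.step()` called once per optimizer step. PyTorch also ships `LinearLR`, `CosineAnnealingLR`, and `SequentialLR`, which can be combined to similar effect.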
Hyperparameter
advanced
Determining warmup steps for a training schedule
If training runs for 100 epochs with the scheduler stepped once per epoch, and the warmup phase should last 10% of training, how many warmup steps should you set?
A. 5 steps
B. 10 steps
C. 20 steps
D. 50 steps
💡 Hint
Calculate 10% of 100 epochs.
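
The same arithmetic as a small helper, sketched under the assumption that the warmup length is a fixed fraction of all scheduler steps. The name `warmup_steps_for` and the `steps_per_epoch` parameter are hypothetical; the second call shows how the count changes if the scheduler is stepped once per batch instead of once per epoch.

```python
def warmup_steps_for(total_epochs, warmup_fraction, steps_per_epoch=1):
    """Number of scheduler steps to spend in warmup."""
    total_steps = total_epochs * steps_per_epoch
    return int(round(total_steps * warmup_fraction))


# Scheduler stepped once per epoch, as in the problem statement.
print(warmup_steps_for(100, 0.10))

# Scheduler stepped once per batch, assuming 500 batches per epoch.
print(warmup_steps_for(100, 0.10, steps_per_epoch=500))
```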
Metrics
expert
Effect of warmup on training loss curve
When comparing training runs with and without learning rate warmup, which difference in the training loss curve would you expect?
A. With warmup, loss may fall more slowly at first but decreases smoothly; without warmup, loss fluctuates or spikes early
B. With warmup, loss decreases abruptly; without warmup, loss decreases smoothly
C. With warmup, loss increases steadily; without warmup, loss decreases steadily
D. With warmup, loss remains constant; without warmup, loss decreases steadily
💡 Hint
Think about stability of updates at the start of training.
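
One rough way to check a logged loss curve for early instability is to count how often the loss rises from one step to the next during the first few steps. The sketch below is only an illustration: `count_spikes` is a hypothetical helper, and both loss lists are made-up numbers, not measurements from a real run.

```python
def count_spikes(losses, window, tolerance=0.0):
    """Count increases larger than tolerance within the first `window` losses."""
    early = losses[:window]
    return sum(1 for prev, cur in zip(early, early[1:]) if cur - prev > tolerance)


# Made-up loss values for illustration only (not real training results).
smooth_curve = [2.30, 2.21, 2.10, 1.98, 1.85, 1.74]
spiky_curve = [2.30, 2.95, 2.40, 3.10, 2.20, 1.90]

print("smooth curve spikes:", count_spikes(smooth_curve, window=6))
print("spiky curve spikes:", count_spikes(spiky_curve, window=6))
```

Applying the same count to the real logs of a run with warmup and a run without it gives a simple number to compare alongside the plotted curves.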