PyTorch · ML · ~20 mins

Checkpoint with optimizer state in PyTorch - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output (intermediate)
What is the output of this PyTorch checkpoint loading code?
Consider the following PyTorch code that saves and loads a model checkpoint including the optimizer state. What will be printed after loading?
```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Simulate one optimizer step
optimizer.zero_grad()
output = model(torch.tensor([[1.0, 2.0]]))
loss = output.sum()
loss.backward()
optimizer.step()

# Save checkpoint
checkpoint = {'model_state': model.state_dict(), 'optimizer_state': optimizer.state_dict()}
torch.save(checkpoint, 'checkpoint.pth')

# Create new model and optimizer
model2 = nn.Linear(2, 1)
optimizer2 = optim.SGD(model2.parameters(), lr=0.1)

# Load checkpoint
loaded = torch.load('checkpoint.pth')
model2.load_state_dict(loaded['model_state'])
optimizer2.load_state_dict(loaded['optimizer_state'])

# Check optimizer state keys
print(sorted(optimizer2.state_dict().keys()))
```
A. ['param_groups', 'state']
B. ['model_state', 'optimizer_state']
C. ['weights', 'biases']
D. ['learning_rate', 'momentum']
💡 Hint
Look at what keys are stored inside optimizer.state_dict() in PyTorch.
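To check the answer yourself, here is a minimal sketch (the momentum setting is illustrative, added so the `state` entry becomes non-trivial) that inspects an optimizer's `state_dict` directly:

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# The two top-level keys exist even before any step is taken.
print(sorted(optimizer.state_dict().keys()))  # ['param_groups', 'state']

# After one step, per-parameter buffers (here, momentum) appear under 'state'.
model(torch.ones(1, 2)).sum().backward()
optimizer.step()
print(len(optimizer.state_dict()['state']) > 0)  # True
```

Note that `state` starts out empty for a fresh optimizer; it fills in as the optimizer takes steps, but the key itself is always present.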
Multiple Choice (intermediate)
Which optimizer state is necessary to save for resuming training exactly?
When saving a checkpoint to resume training later without losing optimizer progress, which part of the optimizer must be saved?
A. Only the model's parameters
B. Only the gradients of the model parameters
C. Only the optimizer's hyperparameters, such as learning rate and momentum
D. The optimizer's internal state (momentum buffers, etc.) and parameter groups
💡 Hint
Think about what the optimizer uses internally to update parameters beyond just hyperparameters.
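As a sketch of the full save-and-resume round trip (the checkpoint is written to an in-memory buffer here instead of a file, and the `'epoch'` key is illustrative bookkeeping):

```python
import io
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.05, momentum=0.9)

# One training step so the optimizer accumulates momentum buffers.
optimizer.zero_grad()
model(torch.randn(8, 4)).sum().backward()
optimizer.step()

# Save weights, optimizer internals, and bookkeeping together.
buffer = io.BytesIO()
torch.save({'epoch': 3,
            'model_state': model.state_dict(),
            'optimizer_state': optimizer.state_dict()}, buffer)

# Resume: build fresh objects, then restore both state dicts.
buffer.seek(0)
ckpt = torch.load(buffer)
model2 = nn.Linear(4, 2)
optimizer2 = optim.SGD(model2.parameters(), lr=0.05, momentum=0.9)
model2.load_state_dict(ckpt['model_state'])
optimizer2.load_state_dict(ckpt['optimizer_state'])

# The momentum buffers survived the round trip.
print(len(optimizer2.state_dict()['state']) > 0)  # True
```

Restoring only `model_state` would leave `optimizer2` with empty momentum buffers, so the first steps after resuming would differ from uninterrupted training.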
Hyperparameter (advanced)
What happens if you load an optimizer state with a different learning rate than the current optimizer?
Suppose you saved an optimizer state with learning rate 0.01 but now you create a new optimizer with learning rate 0.001 and load the saved state. What learning rate will the optimizer use after loading?
A. It will raise an error due to the mismatch
B. It will keep 0.001 from the new optimizer, ignoring the loaded state
C. It will use 0.01 from the loaded state, overriding the new optimizer's setting
D. It will average 0.01 and 0.001 and use 0.0055
💡 Hint
Loading optimizer state_dict overwrites all parameter groups including learning rates.
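A small sketch confirming this behaviour, along with one common fix (re-setting the learning rate explicitly after loading):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)

# State dict captured from an optimizer created with lr=0.01.
saved_state = optim.SGD(model.parameters(), lr=0.01).state_dict()

# New optimizer created with lr=0.001, then the old state is loaded.
new_opt = optim.SGD(model.parameters(), lr=0.001)
new_opt.load_state_dict(saved_state)
print(new_opt.param_groups[0]['lr'])  # 0.01 (the loaded value wins)

# If the new lr is intended, set it explicitly after loading:
new_opt.param_groups[0]['lr'] = 0.001
```

`load_state_dict` replaces the optimizer's `param_groups` wholesale, so every loaded hyperparameter, not just the learning rate, overrides whatever the new optimizer was constructed with.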
🔧 Debug (advanced)
Why does this checkpoint loading code cause a runtime error?
Given this code snippet, why does loading the optimizer state cause a runtime error? Assume the checkpoint was saved from a model with 2 input features instead of 3.

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(3, 2)
optimizer = optim.Adam(model.parameters(), lr=0.01)
checkpoint = torch.load('checkpoint.pth')
model.load_state_dict(checkpoint['model_state'])
optimizer.load_state_dict(checkpoint['optimizer_state'])
```
A. The model state dict keys do not match, causing the optimizer load to fail
B. The optimizer state does not match because the model parameters changed shape
C. The checkpoint file is corrupted
D. The optimizer type must be the same, but it is different
💡 Hint
Check if model parameter shapes match between saved and current model.
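A sketch reproducing the shape-mismatch failure (using an in-memory buffer instead of a file; model sizes are illustrative):

```python
import io
import torch
import torch.nn as nn

# Checkpoint saved from a model with 2 input features.
old_model = nn.Linear(2, 2)
buffer = io.BytesIO()
torch.save({'model_state': old_model.state_dict()}, buffer)
buffer.seek(0)
ckpt = torch.load(buffer)

# Loading into a 3-feature model raises a RuntimeError for the mismatched shape.
new_model = nn.Linear(3, 2)
try:
    new_model.load_state_dict(ckpt['model_state'])
except RuntimeError as err:
    print('size mismatch' in str(err))  # True
```

The saved tensors are copied element-for-element into the current parameters, so any difference in parameter shape between the saved and current model surfaces as a `RuntimeError` at load time.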
🧠 Conceptual (expert)
Why is saving optimizer state important for training with adaptive optimizers?
Adaptive optimizers like Adam keep internal statistics (e.g., running averages of gradients). Why is saving and restoring the optimizer state critical when resuming training with such optimizers?
A. Because internal statistics affect parameter updates, and losing them changes training dynamics
B. Because the optimizer state controls the learning rate scheduler
C. Because the optimizer state contains the training data used so far
D. Because without the optimizer state, the model weights cannot be restored
💡 Hint
Think about what adaptive optimizers use internally to adjust updates.
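A sketch peeking at Adam's per-parameter statistics after a single step (default, non-AMSGrad Adam assumed):

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
optimizer = optim.Adam(model.parameters(), lr=0.01)

# One update so Adam populates its running averages.
model(torch.ones(1, 2)).sum().backward()
optimizer.step()

# Each parameter's entry holds the first/second-moment estimates and step count.
per_param = optimizer.state_dict()['state'][0]
print(sorted(per_param.keys()))  # ['exp_avg', 'exp_avg_sq', 'step']
```

Discarding `exp_avg` and `exp_avg_sq` resets Adam's bias-corrected moment estimates to zero, so the effective per-parameter step sizes immediately after resuming differ from what uninterrupted training would have produced.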