PyTorch users write their own training loops instead of using a built-in one. Why is this explicit training loop important?
Think about flexibility and control during model training.
PyTorch's explicit training loop gives users full control over each step, enabling easy customization and debugging. It supports automatic differentiation, so gradients are computed automatically, but users decide how to use them.
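To make this concrete, here is a minimal explicit loop, sketched with assumed toy data (learning y = 2x with a one-feature linear model); the individual steps and their order are the point, not the specific model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is reproducible

# Toy data (assumed for illustration): learn y = 2x
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = 2.0 * x

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
criterion = nn.MSELoss()

losses = []
for epoch in range(100):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = criterion(model(x), y)  # forward pass + loss
    loss.backward()                # autograd fills each param.grad
    optimizer.step()               # apply the gradients
    losses.append(loss.item())
```

Because every step is an ordinary Python statement, you can log, clip gradients, skip updates, or set breakpoints anywhere inside the loop.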
What will be the printed output after running this PyTorch training loop snippet?
import torch
import torch.nn as nn

model = nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()
inputs = torch.tensor([[1.0, 2.0]])
target = torch.tensor([[1.0]])

optimizer.zero_grad()
output = model(inputs)
loss = criterion(output, target)
loss.backward()
optimizer.step()
print(round(loss.item(), 4))
Think about the initial random weights and the MSE loss formula.
The initial weights and bias of nn.Linear are random, so the output does not equal the target and the loss is nonzero. Because no random seed is set, the exact printed value cannot be predicted and changes from run to run. Note also that the loss is computed before optimizer.step(), so the printed number reflects the parameters before the update.
In a PyTorch training loop, which step correctly updates the model parameters after computing gradients?
Which function applies the computed gradients to change model weights?
optimizer.step() applies the gradients to update model parameters. loss.backward() computes gradients, optimizer.zero_grad() clears old gradients, and model.backward() is not a valid method.
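A small sketch of this division of labor, using an assumed one-sample setup: backward() fills the gradients (and accumulates them if they are not cleared), while step() is the only call that changes the parameters.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()
inputs = torch.tensor([[1.0, 2.0]])
target = torch.tensor([[1.0]])

# One backward pass: record the gradient on the weight.
optimizer.zero_grad()
criterion(model(inputs), target).backward()
grad_once = model.weight.grad.clone()

# Backward again WITHOUT zero_grad(): gradients accumulate (sum up).
criterion(model(inputs), target).backward()
grad_twice = model.weight.grad.clone()

# step() is what actually changes the parameters.
before = model.weight.clone()
optimizer.step()
```

Here grad_twice is exactly twice grad_once, and the weight only moves once step() runs, which is why the canonical order is zero_grad → backward → step.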
What happens if the learning rate in the optimizer is set too high in a PyTorch training loop?
Think about how big steps affect learning stability.
A very high learning rate can cause the model to overshoot the best parameters, making training unstable or diverging instead of converging.
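This is easy to see on the one-dimensional loss f(w) = w², a toy objective assumed for illustration: the gradient is 2w, so SGD computes w ← (1 − 2·lr)·w, and any lr above 1.0 makes |w| grow every step instead of shrink.

```python
import torch

def run(lr, steps=5):
    # Minimize f(w) = w^2 from w = 1.0 with plain SGD.
    w = torch.tensor([1.0], requires_grad=True)
    optimizer = torch.optim.SGD([w], lr=lr)
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = (w * w).sum()
        loss.backward()      # gradient is 2w
        optimizer.step()     # w <- (1 - 2*lr) * w
        losses.append(loss.item())
    return losses

good = run(lr=0.1)  # factor 0.8 per step: loss shrinks toward 0
bad = run(lr=1.5)   # factor -2 per step: |w| doubles, loss explodes
```

With lr=0.1 the loss sequence decays geometrically; with lr=1.5 it grows fourfold each step, which is exactly the overshoot-and-diverge behavior described above.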
What error will this PyTorch training loop snippet raise?
import torch
import torch.nn as nn

model = nn.Linear(3, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
inputs = torch.tensor([[1.0, 2.0]])  # Only 2 features instead of 3
target = torch.tensor([[1.0]])

optimizer.zero_grad()
output = model(inputs)
loss = criterion(output, target)
loss.backward()
optimizer.step()
Check the input size vs model expected input size.
The model expects inputs with 3 features, but the input tensor has only 2, so the forward pass raises a RuntimeError about a matrix shape mismatch (in recent PyTorch versions the message reads "mat1 and mat2 shapes cannot be multiplied (1x2 and 3x1)").
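A minimal sketch of the failure and the fix: the second dimension of the input batch must equal the layer's in_features (3 here), and the error surfaces as soon as the forward pass runs.

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)  # expects 3 input features

bad_inputs = torch.tensor([[1.0, 2.0]])  # only 2 features
try:
    model(bad_inputs)
    raised = False
except RuntimeError:
    raised = True  # shape mismatch in the underlying matrix multiply

good_inputs = torch.tensor([[1.0, 2.0, 3.0]])  # 3 features matches in_features
output = model(good_inputs)  # shape (1, 1): one sample, one output
```

Checking tensor.shape against the layer's in_features before training is a quick way to catch this class of bug early.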