Recall & Review
beginner
What is the purpose of the backward pass in PyTorch?
The backward pass computes gradients of the loss with respect to model parameters. It helps the model learn by showing how to adjust weights to reduce errors.
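A minimal sketch of the backward pass on a single weight (the variable names here are illustrative, not from the cards):

```python
import torch

# One weight, one input, squared-error loss.
w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0)
target = torch.tensor(10.0)

pred = w * x                 # forward pass: pred = 6.0
loss = (pred - target) ** 2  # loss = (6 - 10)^2 = 16.0

loss.backward()              # backward pass: fills w.grad

# dloss/dw = 2 * (w*x - target) * x = 2 * (-4) * 3 = -24
print(w.grad)                # tensor(-24.)
```

The gradient tells the optimizer which direction (and how strongly) to adjust `w` to reduce the loss.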
beginner
What does the method loss.backward() do in PyTorch?
It calculates the gradients of the loss tensor with respect to all tensors that have requires_grad=True. These gradients are stored in the .grad attribute of each tensor.
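A short sketch showing where those gradients end up, using a hypothetical tiny `nn.Linear` model (not one defined in the cards):

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)   # illustrative model: 3 inputs, 1 output
x = torch.ones(1, 3)
loss = model(x).sum()

# Before backward, no gradients exist yet.
assert model.weight.grad is None

loss.backward()

# After backward, each parameter's .grad matches its shape.
print(model.weight.grad.shape)  # torch.Size([1, 3])
print(model.bias.grad)          # tensor([1.])
```

The optimizer later reads exactly these `.grad` attributes when it updates the parameters.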
intermediate
Why do we need to call optimizer.zero_grad() before loss.backward()?
Because PyTorch accumulates gradients by default, calling optimizer.zero_grad() clears the old gradients. This prevents mixing gradients from multiple backward passes.
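The accumulation behavior can be seen directly on a single tensor (a sketch; `loss_fn` is a made-up helper, and `w.grad.zero_()` stands in for what `optimizer.zero_grad()` does across all parameters):

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

def loss_fn(w):
    return w * 3.0   # dloss/dw = 3

# Without zeroing, gradients accumulate across backward calls.
loss_fn(w).backward()
print(w.grad)        # tensor(3.)
loss_fn(w).backward()
print(w.grad)        # tensor(6.)  <- 3 + 3, not 3

# Clearing restores a fresh gradient for the next step.
w.grad.zero_()
loss_fn(w).backward()
print(w.grad)        # tensor(3.)
```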
beginner
What happens if you forget to call loss.backward() during training?
No gradients will be computed, so the optimizer cannot update the model weights. The model will not learn or improve.
intermediate
How does PyTorch know which operations to track for gradient computation?
PyTorch builds a computation graph dynamically during the forward pass. It tracks operations on tensors with requires_grad=True and uses this graph to compute gradients during the backward pass.
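The dynamic graph can be inspected through each result's grad_fn attribute, which records the operation that produced it (a small sketch with made-up values):

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x * 3          # recorded as a multiplication node
z = y + 1          # recorded as an addition node

# Each intermediate result remembers the operation that created it;
# backward() walks these nodes in reverse to reach x.
print(y.grad_fn)   # <MulBackward0 ...>
print(z.grad_fn)   # <AddBackward0 ...>

z.backward()
print(x.grad)      # tensor(3.)  -> dz/dx = 3
```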
What does loss.backward() compute?
loss.backward() calculates the gradients needed for updating the model weights.
Why do we call optimizer.zero_grad() before calling loss.backward()?
Gradients accumulate by default, so clearing them prevents mixing updates from different steps.
If a tensor has requires_grad=False, what happens during loss.backward()?
No gradient is computed for it; only tensors with requires_grad=True get gradients computed.
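A quick sketch of the difference (variable names are illustrative):

```python
import torch

w = torch.tensor(5.0, requires_grad=True)
frozen = torch.tensor(2.0, requires_grad=False)  # False is the default

loss = w * frozen
loss.backward()

print(w.grad)       # tensor(2.)  -> gradient computed
print(frozen.grad)  # None        -> skipped during the backward pass
```

This is also why freezing layers (setting requires_grad=False on their parameters) stops them from being updated.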
What is stored in the .grad attribute after loss.backward()?
.grad holds the computed gradients for each parameter.
What does PyTorch use to track operations for gradient calculation?
PyTorch builds a dynamic computation graph during the forward pass to track operations.
Explain in your own words what happens during the backward pass when you call loss.backward() in PyTorch.
Hint: Think about how PyTorch figures out how to change the weights to reduce the loss.
Why is it important to call optimizer.zero_grad() before loss.backward() in a training loop?
Hint: Consider what happens if gradients from previous steps mix with the current ones.
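To tie the cards together, here is a minimal sketch of a full training loop in the usual zero_grad / backward / step order (the model, data, and learning rate are made up for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.tensor([[1.0]])
target = torch.tensor([[2.0]])

for step in range(100):
    optimizer.zero_grad()                      # clear gradients from the previous step
    loss = (model(x) - target).pow(2).mean()   # forward pass
    loss.backward()                            # compute fresh gradients
    optimizer.step()                           # update weights using each .grad

print(loss.item())  # close to 0 once the model has fit the single data point
```

If the zero_grad call were omitted, each step would apply the sum of all past gradients instead of the current one, and training would behave erratically.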