Imagine you are teaching a robot to recognize cats in photos. The robot guesses, but sometimes it is wrong. How does backpropagation help the robot improve its guesses?
Think about how the robot learns from mistakes by adjusting what it knows.
Backpropagation calculates the error between the robot's guess and the correct answer, then adjusts the internal settings (weights) to reduce this error in future guesses.
Given a simple neural network with one weight w and loss function L = (w * x - y)^2, where x=2 and y=4, what is the gradient of L with respect to w when w=1?
x = 2
y = 4
w = 1
L = (w * x - y)**2

# Calculate dL/dw via the chain rule
dL_dw = 2 * (w * x - y) * x
print(dL_dw)  # -8
Use the chain rule: dL/dw = 2 * (w*x - y) * x
First compute the residual: w*x - y = 1*2 - 4 = -2. Then apply the chain rule: dL/dw = 2 * (-2) * 2 = -8.
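To see how this gradient actually improves the weight, here is a minimal sketch of a few gradient-descent updates on the same setup. The learning rate of 0.1 and the loop of 5 steps are illustrative choices, not part of the question:

```python
x, y = 2, 4
w = 1.0
lr = 0.1  # illustrative learning rate, not specified in the question

for step in range(5):
    grad = 2 * (w * x - y) * x   # dL/dw from the chain rule
    w = w - lr * grad            # step against the gradient
    loss = (w * x - y) ** 2
    print(step, round(w, 4), round(loss, 6))
```

Each step moves w toward 2 (where w*x = y and the loss is zero), and the loss shrinks toward 0.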
Backpropagation through time (BPTT) is a variant of backpropagation that unrolls a model across its time steps before propagating errors backward. Which model type uses BPTT to learn from sequences?
Think about models that process data step-by-step over time.
BPTT is used to train recurrent neural networks (RNNs): because an RNN reuses the same weights at every time step, the network is unrolled over the sequence and the error is propagated backward through each time step, accumulating gradient contributions along the way.
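The idea can be sketched with a tiny linear scalar RNN, h_t = w * h_{t-1} + x_t, with a squared loss on the final hidden state. The input sequence, target, and initial weight below are made-up illustrations; the point is how the backward loop walks the time steps in reverse and accumulates dL/dw:

```python
# Minimal BPTT sketch for a linear scalar RNN: h_t = w * h_{t-1} + x_t.
# All numbers are illustrative assumptions, not from the question.
xs = [1.0, 2.0, 3.0]   # input sequence
y = 5.0                # target for the final hidden state
w = 0.5

# Forward pass: record every hidden state so we can backprop later.
hs = [0.0]                       # h_0
for x in xs:
    hs.append(w * hs[-1] + x)

loss = (hs[-1] - y) ** 2

# Backward pass: visit the time steps in reverse, accumulating dL/dw.
dL_dh = 2 * (hs[-1] - y)         # gradient at the last hidden state
dL_dw = 0.0
for t in range(len(xs), 0, -1):
    dL_dw += dL_dh * hs[t - 1]   # dh_t/dw contributes h_{t-1}
    dL_dh *= w                   # propagate through h_t = w * h_{t-1} + x_t

print(round(loss, 4), round(dL_dw, 4))  # 0.5625 -4.5
```

Because w appears at every time step, the gradient is a sum over all steps, which is exactly what "propagating errors back through time" means.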
When training a neural network, which setting decides how big each step is when adjusting weights after calculating gradients?
It is a small number that multiplies the gradient to update weights.
The learning rate scales the gradient during weight updates, controlling how fast or slow the model learns.
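A quick sketch makes the scaling concrete. Reusing the earlier setup (x=2, y=4, w=1, so dL/dw = -8), the three learning rates below are illustrative choices:

```python
# Sketch: the learning rate scales the gradient step.
# Setup mirrors the earlier example; the rates are illustrative.
x, y = 2, 4

def one_step(w, lr):
    grad = 2 * (w * x - y) * x   # dL/dw
    return w - lr * grad         # update scaled by the learning rate

print(one_step(1.0, 0.01))  # 1.08 -- small rate, tiny step toward w=2
print(one_step(1.0, 0.1))   # 1.8  -- larger rate, bigger step
print(one_step(1.0, 0.5))   # 5.0  -- too large, overshoots past w=2
```

The same gradient produces very different updates: too small a rate learns slowly, while too large a rate can overshoot the minimum entirely.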
You trained a model and see that training loss keeps decreasing while validation loss starts increasing. What does this divergence between the two curves indicate?
Overfitting means the model learns training data too well but fails on new data.
When training loss keeps falling (training accuracy improves) but validation loss rises (validation accuracy worsens), the model is memorizing the training data rather than generalizing: it is overfitting.
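In practice this check is often automated as an early-stopping signal. Here is a sketch that scans two loss curves for the turning point; the loss values are made-up illustrations, not real training output:

```python
# Sketch of spotting overfitting from loss curves: training loss keeps
# falling while validation loss turns upward. Values are illustrative.
train_loss = [0.9, 0.6, 0.4, 0.3, 0.2, 0.15]
val_loss   = [1.0, 0.7, 0.5, 0.45, 0.5, 0.6]

# Flag the first epoch where validation loss rises while training loss
# still falls -- a common early-stopping criterion.
for epoch in range(1, len(val_loss)):
    if val_loss[epoch] > val_loss[epoch - 1] and train_loss[epoch] < train_loss[epoch - 1]:
        print("overfitting suspected at epoch", epoch)
        break
```

Real training frameworks usually add patience (waiting a few epochs before stopping) so a single noisy validation reading does not trigger a false alarm.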