Learning rate schedulers change the learning rate (the size of each parameter update) during training. A well-chosen schedule can speed up convergence and improve final results.
Learning rate schedulers in PyTorch
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
for epoch in range(num_epochs):
    train(...)  # your training code
    scheduler.step()
optimizer is the optimizer you use for training, like Adam or SGD.
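step_size is the number of epochs between learning rate changes.
gamma is the factor the learning rate is multiplied by at each change (gamma=0.1 divides it by 10).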
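The effect of these parameters can be written as a closed-form rule: after a given number of scheduler steps, the base learning rate has been multiplied by gamma once for every full block of step_size epochs. The sketch below is plain Python rather than PyTorch's own implementation; the helper name step_lr_value and the base learning rate of 0.1 are assumptions made for illustration.

```python
def step_lr_value(base_lr: float, step_size: int, gamma: float, epoch: int) -> float:
    """Learning rate after `epoch` scheduler steps under a StepLR-style schedule."""
    # Integer division counts how many full step_size blocks have elapsed.
    num_decays = epoch // step_size
    return base_lr * gamma ** num_decays


# Schedule from the snippet above: step_size=10, gamma=0.1 (base lr of 0.1 is assumed).
for epoch in (0, 9, 10, 19, 20):
    print(epoch, step_lr_value(0.1, 10, 0.1, epoch))
```

The rate stays flat inside each block and only drops when the step count crosses a multiple of step_size.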
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
This code trains a simple linear model on dummy data. The learning rate starts at 0.1 and is divided by 10 every 3 epochs. Each epoch, it prints the loss and the learning rate in effect after the scheduler step.
import torch
import torch.nn as nn
import torch.optim as optim

# Simple model
model = nn.Linear(2, 1)

# Optimizer
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Scheduler: multiply LR by 0.1 every 3 epochs
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

# Dummy data
inputs = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
targets = torch.tensor([[1.0], [2.0]])
loss_fn = nn.MSELoss()

for epoch in range(6):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    loss.backward()
    optimizer.step()
    scheduler.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}, LR: {scheduler.get_last_lr()[0]:.5f}")
Call scheduler.step() once per epoch, after optimizer.step(), to update the learning rate.
You can check the current learning rate with scheduler.get_last_lr().
Different schedulers change the learning rate in different ways; choose one that fits your training needs.
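When an epoch contains several mini-batches, the notes above imply two different cadences: optimizer.step() runs once per batch, while an epoch-based scheduler such as StepLR is stepped once per epoch. The following is a minimal sketch of that loop structure; the random dataset, the batch size of 4, and the step_size=2, gamma=0.5 settings are assumptions chosen to keep the run short.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # make the dummy data reproducible

model = nn.Linear(2, 1)
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)
loss_fn = nn.MSELoss()

# Hypothetical dummy dataset: 8 samples, split into mini-batches of 4.
inputs = torch.randn(8, 2)
targets = torch.randn(8, 1)
batches = list(zip(inputs.split(4), targets.split(4)))

lr_per_epoch = []
for epoch in range(4):
    lr_per_epoch.append(scheduler.get_last_lr()[0])  # rate used in this epoch
    for batch_inputs, batch_targets in batches:
        optimizer.zero_grad()
        loss = loss_fn(model(batch_inputs), batch_targets)
        loss.backward()
        optimizer.step()  # one parameter update per mini-batch
    scheduler.step()      # one learning rate update per epoch

print(lr_per_epoch)
```

Stepping the scheduler inside the batch loop instead would make step_size count batches rather than epochs, which is usually not what an epoch-based schedule intends.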
Learning rate schedulers adjust the learning rate during training to help the model learn better.
They are used to reduce the learning rate gradually or at specific times.
PyTorch provides many schedulers like StepLR, ExponentialLR, and CosineAnnealingLR.
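To see how these three schedulers differ, it can help to record the learning rate each one produces over the same number of epochs. The sketch below reuses the parameter values from the earlier snippets (step_size=5, gamma=0.5; gamma=0.9; T_max=10); the helper lr_trace, the base learning rate of 0.1, and the single dummy parameter are assumptions made for illustration.

```python
import torch
from torch.optim.lr_scheduler import StepLR, ExponentialLR, CosineAnnealingLR


def lr_trace(make_scheduler, num_epochs=10, base_lr=0.1):
    """Return the initial learning rate followed by the rate after each epoch."""
    param = torch.nn.Parameter(torch.zeros(1))
    optimizer = torch.optim.SGD([param], lr=base_lr)
    scheduler = make_scheduler(optimizer)
    trace = [scheduler.get_last_lr()[0]]
    for _ in range(num_epochs):
        optimizer.step()  # no gradients were computed, so nothing is updated
        scheduler.step()
        trace.append(scheduler.get_last_lr()[0])
    return trace


step_trace = lr_trace(lambda opt: StepLR(opt, step_size=5, gamma=0.5))
exp_trace = lr_trace(lambda opt: ExponentialLR(opt, gamma=0.9))
cos_trace = lr_trace(lambda opt: CosineAnnealingLR(opt, T_max=10))

for name, trace in [
    ("StepLR", step_trace),
    ("ExponentialLR", exp_trace),
    ("CosineAnnealingLR", cos_trace),
]:
    print(name, [round(lr, 4) for lr in trace])
```

StepLR drops in discrete jumps, ExponentialLR shrinks by the same factor every epoch, and CosineAnnealingLR follows half a cosine wave from the base rate down to its minimum (0 by default) over T_max epochs.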
Practice
Solution
Step 1: Understand the role of learning rate
The learning rate controls how fast the model updates its knowledge during training.
Step 2: Identify what a scheduler does
A learning rate scheduler changes the learning rate over time to improve training stability and performance.
Final Answer:
To adjust the learning rate during training for better model performance -> Option D
Quick Check:
Learning rate scheduler adjusts learning rate [OK]
- Confusing scheduler with batch size adjustment
- Thinking scheduler changes model layers
- Assuming scheduler shuffles data
opt with step size 10 and gamma 0.1?
Solution
Step 1: Recall PyTorch StepLR syntax
The correct class is torch.optim.lr_scheduler.StepLR with parameters step_size and gamma.
Step 2: Match parameters correctly
step_size=10 and gamma=0.1 are the correct parameter names and values.
Final Answer:
scheduler = torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.1) -> Option A
Quick Check:
StepLR uses step_size and gamma [OK]
- Using wrong parameter names like step or decay
- Calling StepLR from wrong module
- Mixing up parameter order
scheduler.step()?
import torch
opt = torch.optim.SGD([torch.nn.Parameter(torch.randn(2, 2, requires_grad=True))], lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(opt, step_size=2, gamma=0.5)
for _ in range(3):
    scheduler.step()
current_lr = opt.param_groups[0]['lr']
Solution
Step 1: Understand StepLR behavior
StepLR multiplies the learning rate by gamma every step_size epochs. Here, step_size=2, gamma=0.5.
Step 2: Calculate learning rate after 3 steps
After 1 step: lr=0.1 (no change, step 1 < 2)
After 2 steps: lr=0.1 * 0.5 = 0.05 (step 2 reached)
After 3 steps: lr remains 0.05 (step 3 < 4)
Final Answer:
0.05 -> Option A
Quick Check:
StepLR halves lr every 2 steps [OK]
- Reducing learning rate every step instead of every step_size
- Multiplying gamma incorrectly
- Ignoring initial learning rate
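The step-by-step calculation above can be checked by recording the learning rate after every scheduler step. This is a small sketch based on the quiz snippet; the added opt.step() call and the list name lr_after_each_step are not part of the original question and are included only so the scheduler is stepped in the recommended order.

```python
import torch

param = torch.nn.Parameter(torch.randn(2, 2))
opt = torch.optim.SGD([param], lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(opt, step_size=2, gamma=0.5)

lr_after_each_step = []
for _ in range(3):
    opt.step()        # added: keeps optimizer.step() ahead of scheduler.step()
    scheduler.step()
    lr_after_each_step.append(opt.param_groups[0]["lr"])

print(lr_after_each_step)
```

The second entry is where the halving happens; the first and third entries repeat the previous value.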
import torch
opt = torch.optim.Adam([torch.nn.Parameter(torch.randn(3, 3, requires_grad=True))], lr=0.01)
scheduler = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.9)
for epoch in range(5):
    scheduler.step()
    print(f"Epoch {epoch}: lr = {opt.param_groups[0]['lr']}")
Solution
Step 1: Recall correct scheduler usage
In PyTorch, scheduler.step() should be called after optimizer.step() to update learning rate correctly.
Step 2: Check code order
The code calls scheduler.step() without ever calling optimizer.step(). PyTorch (1.1.0 and later) emits a warning in this case, noting that the first value of the learning rate schedule is skipped.
Final Answer:
scheduler.step() should be called after optimizer.step() -> Option B
Quick Check:
Call scheduler.step() after optimizer.step() [OK]
- Calling scheduler.step() before optimizer.step()
- Using invalid gamma values
- Misunderstanding scheduler existence
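A corrected version of the snippet might look like the sketch below, with optimizer.step() placed before scheduler.step() in every epoch. The squared-sum loss is a placeholder assumed for illustration, since the original snippet contains no training code, and the list lrs is a hypothetical name used to collect the rates.

```python
import torch

param = torch.nn.Parameter(torch.randn(3, 3))
opt = torch.optim.Adam([param], lr=0.01)
scheduler = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.9)

lrs = []
for epoch in range(5):
    opt.zero_grad()
    loss = (param ** 2).sum()  # placeholder loss, assumed for illustration
    loss.backward()
    opt.step()                 # update parameters first
    lrs.append(opt.param_groups[0]["lr"])  # rate that was used for this update
    scheduler.step()           # then advance the schedule
    print(f"Epoch {epoch}: lr = {lrs[-1]}")
```

With this order, the first epoch trains at the initial rate of 0.01 and each later epoch uses the previous rate multiplied by 0.9.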
Solution
Step 1: Understand the two-phase learning rate schedule
First phase: reduce lr by half every 5 epochs for 20 epochs.
Second phase: after 20 epochs, apply exponential decay by 0.9 every epoch.
Step 2: Match PyTorch schedulers to phases
StepLR with step_size=5, gamma=0.5 fits first phase.
ExponentialLR with gamma=0.9 fits second phase.
Switching schedulers after 20 epochs achieves desired behavior.
Final Answer:
Use StepLR with step_size=5, gamma=0.5 for first 20 epochs, then switch to ExponentialLR with gamma=0.9 -> Option C
Quick Check:
Combine StepLR then ExponentialLR for phased decay [OK]
- Trying to use one scheduler for both phases
- Ignoring the switch at epoch 20
- Using wrong scheduler types for phases
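One way to realize the switch is to replace the scheduler object at the start of epoch 20, as in the sketch below. It assumes a base learning rate of 0.1 and a single dummy parameter, and it relies on ExponentialLR starting from the optimizer's current learning rate when constructed mid-training, which is worth verifying on the PyTorch version in use. No gradients are computed here, so opt.step() only stands in for real training.

```python
import torch
from torch.optim.lr_scheduler import StepLR, ExponentialLR

param = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.SGD([param], lr=0.1)  # base lr of 0.1 is an assumption

# Phase 1: halve the learning rate every 5 epochs.
scheduler = StepLR(opt, step_size=5, gamma=0.5)

lrs = []
for epoch in range(25):
    if epoch == 20:
        # Phase 2: continue from the current rate with per-epoch decay of 0.9.
        scheduler = ExponentialLR(opt, gamma=0.9)
    lrs.append(opt.param_groups[0]["lr"])  # rate used during this epoch
    opt.step()        # stands in for the real training step
    scheduler.step()

print([round(lr, 6) for lr in lrs])
```

Creating the second scheduler before opt.step() in epoch 20 keeps the recommended call order intact for the new scheduler's first step.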
