The learning rate controls how large a step the optimizer takes each time it updates the model's weights. A well-chosen learning rate lets training converge without stalling (rate too low) or jumping around and diverging (rate too high).
Learning rate selection in Computer Vision
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
The learning rate is set as lr in the optimizer.
Common optimizers include SGD, Adam, and RMSprop.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
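To make the scheduler line more concrete, here is a minimal sketch of the step-decay rule that a schedule with step_size=10 and gamma=0.1 follows: the learning rate is multiplied by gamma once every step_size epochs. The function name step_decay_lr is an assumption made for illustration; it is a hand-written stand-in, not PyTorch's own code.

```python
def step_decay_lr(initial_lr, epoch, step_size=10, gamma=0.1):
    """Return the learning rate for `epoch` under a step-decay schedule.

    Hypothetical helper: multiplies the rate by `gamma` once for every
    full block of `step_size` epochs that has passed.
    """
    return initial_lr * gamma ** (epoch // step_size)


# Learning rate for each of 30 epochs, starting from 0.01
schedule = [step_decay_lr(0.01, epoch) for epoch in range(30)]

# Show the rate just before and just after each drop
for epoch in (0, 9, 10, 19, 20, 29):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

With these settings the rate stays at 0.01 for epochs 0 to 9, drops to about 0.001 for epochs 10 to 19, and to about 0.0001 after that.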
This code trains a simple CNN on MNIST for one batch using a learning rate of 0.01. It prints the loss to show training progress.
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Simple CNN model for MNIST
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.pool = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(10 * 12 * 12, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = x.view(-1, 10 * 12 * 12)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Load MNIST data
transform = transforms.Compose([transforms.ToTensor()])
train_dataset = datasets.MNIST('.', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)

# Initialize model, loss, optimizer with learning rate 0.01
model = SimpleCNN()
criterion = nn.CrossEntropyLoss()
learning_rate = 0.01
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

# Train for 1 batch
model.train()
for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()
    output = model(data)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    if batch_idx == 0:
        print(f'Batch {batch_idx} Loss: {loss.item():.4f}')
        break
Too high learning rate can make training unstable.
Too low learning rate makes training slow.
Try learning rate schedules to improve training.
Learning rate controls how fast the model updates.
Choose learning rate carefully for good training.
Use optimizers and schedulers to manage learning rate.
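The points above about rates that are too high or too low can be seen on a toy problem. The sketch below minimizes f(w) = w**2 with plain gradient descent at three learning rates. The function gradient_descent and the chosen rates are assumptions picked for illustration; a real vision model behaves less cleanly, but the pattern is similar.

```python
def gradient_descent(lr, steps=20, w0=1.0):
    """Minimize f(w) = w**2 with plain gradient descent; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w       # derivative of w**2
        w = w - lr * grad    # step size is scaled by the learning rate
    return w


# Too low, reasonable, and too high for this toy problem
results = {lr: gradient_descent(lr) for lr in (0.001, 0.1, 1.1)}

for lr, w in results.items():
    print(f"lr={lr}: final w = {w:.4f}, loss = {w ** 2:.4f}")
```

After 20 steps the lowest rate has barely moved w from its starting value of 1.0, the middle rate has brought it close to the minimum at 0, and the highest rate has pushed it further away with every step.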
Practice
What does the learning rate control in training a computer vision model?
Solution
Step 1: Understand the role of learning rate
The learning rate determines how much the model changes its weights at each update step.
Step 2: Connect learning rate to model updates
A higher learning rate means larger, faster updates, while a lower rate means smaller, slower updates.
Final Answer:
How fast the model updates its knowledge -> Option C
Quick Check:
Learning rate controls update speed = C [OK]
- Confusing learning rate with model size
- Thinking learning rate changes input data
- Mixing learning rate with activation functions
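The relationship in the solution above can be written as the update rule for plain SGD without momentum. The symbols are assumptions introduced here: w_t is the weights at step t, L is the loss, and \eta is the learning rate. Adam and RMSprop rescale the step further, so this is only the simplest case.

```latex
w_{t+1} = w_t - \eta \, \nabla_w L(w_t)
```

Doubling \eta doubles the size of the change to the weights for the same gradient.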
Which of the following is the correct way to set a learning rate of 0.01 using PyTorch's SGD optimizer?
import torch.optim as optim
optimizer = optim.SGD(model.parameters(), lr=___)
Solution
Step 1: Check the expected type for learning rate
The learning rate parameter expects a number (normally a float), not a string or an undefined variable.
Step 2: Identify the correct float value for 0.01
Using 0.01 as a float sets the learning rate correctly.
Final Answer:
0.01 -> Option A
Quick Check:
Learning rate as float = 0.01 [OK]
- Using string "0.01" instead of float 0.01
- Passing undefined variable learning_rate
- Setting lr to 0.1 by mistake
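One way to catch the string mistake early is to check the value before it reaches the optimizer. The helper below, validate_learning_rate, is hypothetical and not part of PyTorch. Requiring a strictly positive rate is this sketch's own rule, not a statement about what the library accepts.

```python
def validate_learning_rate(lr):
    """Hypothetical helper: reject non-numeric or non-positive learning rates."""
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(lr, bool) or not isinstance(lr, (int, float)):
        raise TypeError(f"learning rate must be a number, got {type(lr).__name__}")
    if lr <= 0:
        raise ValueError(f"learning rate must be positive, got {lr}")
    return float(lr)


learning_rate = validate_learning_rate(0.01)   # fine: a float

try:
    validate_learning_rate("0.01")             # the string mistake
except TypeError as err:
    error_message = str(err)
    print(error_message)
```

The checked value can then be passed as lr when the optimizer is created.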
Consider this training loop snippet for a vision model:
learning_rate = 0.5
for epoch in range(3):
    loss = train_one_epoch(model, data, learning_rate)
    print(f"Epoch {epoch+1} loss: {loss:.2f}")
If the learning rate is too high, what is the most likely output behavior?
Solution
Step 1: Understand effect of high learning rate
A very high learning rate like 0.5 can cause the model to overshoot the best weights, making training unstable.
Step 2: Predict loss behavior with unstable training
Loss will not steadily decrease but will jump up and down or increase.
Final Answer:
Loss fluctuates or increases wildly -> Option D
Quick Check:
High lr causes unstable loss = D [OK]
- Assuming loss always decreases regardless of lr
- Thinking loss becomes zero immediately
- Confusing constant loss with stable training
Given this code snippet, identify the error related to learning rate usage:
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
for epoch in range(5):
    loss = train(model, data)
    optimizer.step()
    optimizer.zero_grad()
Solution
Step 1: Check optimizer usage order
Before calling optimizer.step(), gradients must be computed by loss.backward().
Step 2: Identify missing backward call
The snippet never calls loss.backward(), so optimizer.step() has no gradients to apply.
Final Answer:
optimizer.step() called before loss.backward() -> Option A
Quick Check:
Missing loss.backward() before step = A [OK]
- Thinking learning rate 0.001 is too high for Adam
- Believing zero_grad() order is wrong here
- Assuming learning rate must change each epoch
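For comparison, here is a minimal sketch of a loop with the calls in a working order: clear old gradients, run the forward pass, call loss.backward(), then optimizer.step(). The tiny linear model, the random inputs and targets, and the MSE loss are assumptions standing in for the quiz's undefined model, data, and train function.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the toy run repeatable

# Toy stand-ins (assumptions): a small linear model and random data
model = nn.Linear(3, 1)
inputs = torch.randn(16, 3)
targets = torch.randn(16, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

losses = []
for epoch in range(5):
    optimizer.zero_grad()                     # clear gradients from the last step
    loss = criterion(model(inputs), targets)  # forward pass and loss
    loss.backward()                           # compute gradients
    optimizer.step()                          # update weights using those gradients
    losses.append(loss.item())
    print(f"Epoch {epoch + 1} loss: {loss.item():.4f}")
```

Calling zero_grad() at the start of each iteration rather than at the end is a common convention; what matters is that backward() runs before step().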
You want to train a deep vision model on a new dataset. You start with a learning rate of 0.1 but notice training loss does not decrease. What is the best next step?
Solution
Step 1: Analyze why loss does not decrease
A high learning rate like 0.1 can cause the model to overshoot the best weights, preventing the loss from decreasing.
Step 2: Choose a safer learning rate adjustment
Lowering the learning rate to 0.01 allows smaller, more stable updates that can improve training.
Final Answer:
Decrease the learning rate to 0.01 and try again -> Option B
Quick Check:
Lower lr if loss stuck = B [OK]
- Increasing learning rate when training fails
- Ignoring learning rate and training longer
- Removing learning rate parameter entirely
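The "lower the rate and try again" advice can be sketched as a small search: start at 0.1 and divide by 10 until a short training run ends with a lower loss than it started with. Everything here is an assumption for illustration: the toy loss f(w) = 15 * w**2 stands in for a real model, and the names loss_after_training and find_working_lr are made up.

```python
def loss_after_training(lr, steps=10, w0=1.0):
    """Run gradient descent on f(w) = 15 * w**2 and return the final loss."""
    w = w0
    for _ in range(steps):
        grad = 30.0 * w   # derivative of 15 * w**2
        w -= lr * grad
    return 15.0 * w ** 2


def find_working_lr(start_lr=0.1, factor=0.1, max_tries=4):
    """Shrink the learning rate until a short run reduces the loss."""
    initial_loss = loss_after_training(start_lr, steps=0)  # loss before training
    lr = start_lr
    for _ in range(max_tries):
        final_loss = loss_after_training(lr)
        if final_loss < initial_loss:
            return lr, final_loss
        lr *= factor  # loss did not improve: try a rate 10x smaller
    return None, None


chosen_lr, final_loss = find_working_lr()
print(f"chosen lr: {chosen_lr:.3f}, final loss: {final_loss:.4f}")
```

On this toy problem 0.1 makes the loss grow, so the search settles on a rate of about 0.01, matching the quiz answer.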
