
StepLR and MultiStepLR in PyTorch - Model Pipeline Trace

Model Pipeline - StepLR and MultiStepLR

This pipeline traces how the StepLR and MultiStepLR learning rate schedulers lower the learning rate during training: a larger rate early on for fast progress, and a smaller rate later for fine-tuning.

Data Flow - 5 Stages

  1. Data Loading
     Input: 1000 rows x 10 features
     Action: Load dataset with 10 features per sample
     Output: 1000 rows x 10 features
     Example: [[0.5, 1.2, ..., 0.3], [0.1, 0.4, ..., 0.7], ...]
  2. Model Initialization
     Input: 1000 rows x 10 features
     Action: Initialize a simple neural network with 10 input nodes and 2 output nodes
     Output: Model ready for training
     Example: Neural network with layers: Input(10) -> Hidden(5) -> Output(2)
  3. Optimizer Setup
     Input: Model parameters
     Action: Set optimizer with initial learning rate 0.1
     Output: Optimizer ready
     Example: SGD optimizer with lr=0.1
  4. Learning Rate Scheduler Setup
     Input: Optimizer with lr=0.1
     Action: Apply StepLR or MultiStepLR scheduler to adjust learning rate during training
     Output: Scheduler ready to update learning rate
     Example: StepLR with step_size=5, gamma=0.5 or MultiStepLR with milestones=[3,7], gamma=0.1
  5. Training Loop
     Input: Training data and model
     Action: Train model for 10 epochs, update learning rate each epoch using scheduler
     Output: Trained model with updated learning rates
     Example: Epoch 1 lr=0.1, Epoch 5 lr=0.05 (StepLR), Epoch 3 lr=0.01 (MultiStepLR)

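The five stages above can be sketched end to end as follows. This is a minimal illustration, not the page's own implementation: the random data, the fixed seed, the full-batch update, and the `lr_history` list are assumptions made so the snippet runs on its own, and only the StepLR variant is shown.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import StepLR

torch.manual_seed(0)

# Stage 1: synthetic stand-in for the 1000 x 10 dataset (random values, two classes).
features = torch.randn(1000, 10)
labels = torch.randint(0, 2, (1000,))

# Stage 2: Input(10) -> Hidden(5, ReLU) -> Output(2).
# CrossEntropyLoss applies softmax internally, so the model returns raw logits.
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))
loss_fn = nn.CrossEntropyLoss()

# Stage 3: SGD with the initial learning rate 0.1.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Stage 4: halve the learning rate after every 5 scheduler steps.
scheduler = StepLR(optimizer, step_size=5, gamma=0.5)

# Stage 5: 10 epochs, one full-batch update per epoch for brevity.
lr_history = []
for epoch in range(1, 11):
    lr_history.append(optimizer.param_groups[0]["lr"])  # rate used in this epoch
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
    scheduler.step()  # once per epoch, after optimizer.step()
    print(f"Epoch {epoch}: lr={lr_history[-1]:.3f}, loss={loss.item():.4f}")
```

Because `scheduler.step()` runs at the end of each epoch, the halved rate takes effect from epoch 6 onward. The loss values on random data will not match the trace below.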
Training Trace - Epoch by Epoch
Loss
1.0  |*
0.9  | *
0.8  |  *
0.7  |   *
0.6  |    *
0.5  |     *
0.4  |      *
0.3  |       *
     +----------------
      1 2 3 4 5 6 7 8 9 10 Epochs
Epoch | Loss ↓ | Accuracy ↑ | Observation
1     | 0.85   | 0.60       | Initial training with learning rate 0.1
2     | 0.70   | 0.68       | Loss decreased, accuracy improved
3     | 0.60   | 0.72       | Learning rate unchanged for StepLR, decreased for MultiStepLR
4     | 0.55   | 0.75       | Model continues to improve
5     | 0.50   | 0.78       | StepLR reduces learning rate by gamma=0.5 here
6     | 0.45   | 0.80       | Lower learning rate helps fine-tune weights
7     | 0.42   | 0.82       | MultiStepLR reduces learning rate at this milestone
8     | 0.40   | 0.83       | Training stabilizes with smaller learning rate
9     | 0.38   | 0.84       | Model converges further
10    | 0.36   | 0.85       | Final epoch with lowest learning rate
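The trace says a scheduler "reduces" the rate at a given epoch without stating which epoch first trains with the new rate. A small sketch of the closed-form schedules makes that explicit. It assumes `scheduler.step()` is called once at the end of every epoch and that epochs are numbered from 1; `step_lr` and `multistep_lr` are hypothetical helper names that mirror the schedules arithmetically rather than calling PyTorch.

```python
from bisect import bisect_right

def step_lr(base_lr, step_size, gamma, steps_done):
    """Learning rate after `steps_done` scheduler steps under a StepLR-style schedule."""
    return base_lr * gamma ** (steps_done // step_size)

def multistep_lr(base_lr, milestones, gamma, steps_done):
    """Same for a MultiStepLR-style schedule: one factor of gamma per milestone reached."""
    return base_lr * gamma ** bisect_right(sorted(milestones), steps_done)

step_rates, multistep_rates = [], []
for epoch in range(1, 11):
    steps_done = epoch - 1  # the scheduler was stepped once after each earlier epoch
    step_rates.append(step_lr(0.1, 5, 0.5, steps_done))
    multistep_rates.append(multistep_lr(0.1, [3, 7], 0.1, steps_done))
    print(f"Epoch {epoch:2d}: StepLR lr={step_rates[-1]:.3f}  "
          f"MultiStepLR lr={multistep_rates[-1]:.3f}")
```

Under these assumptions StepLR trains epochs 1 to 5 at 0.1 and epochs 6 to 10 at 0.05, while MultiStepLR trains epochs 1 to 3 at 0.1, epochs 4 to 7 at 0.01, and epochs 8 to 10 at 0.001.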
Prediction Trace - 3 Layers
Layer 1: Input Layer
Layer 2: Hidden Layer (ReLU)
Layer 3: Output Layer (Softmax)
Model Quiz - 3 Questions
Test your understanding
What does the StepLR scheduler do during training?
A. Reduces the learning rate by a factor after a fixed number of epochs
B. Increases the learning rate gradually every epoch
C. Keeps the learning rate constant throughout training
D. Randomly changes the learning rate each epoch
Key Insight
Learning rate schedulers like StepLR and MultiStepLR reduce the learning rate at planned points in training: a larger rate early on speeds up learning, and a smaller rate later allows finer weight adjustments that lower the loss and improve accuracy.

Practice

1. What is the main difference between StepLR and MultiStepLR in PyTorch?
easy
A. StepLR decreases learning rate at fixed intervals; MultiStepLR decreases at specific epochs.
B. StepLR increases learning rate; MultiStepLR decreases learning rate.
C. StepLR changes learning rate randomly; MultiStepLR keeps it constant.
D. StepLR is used only for batch size adjustment; MultiStepLR for learning rate.

Solution

  1. Step 1: Understand StepLR behavior

    StepLR reduces the learning rate by a factor every fixed number of epochs (step size).
  2. Step 2: Understand MultiStepLR behavior

    MultiStepLR reduces the learning rate at specific epochs defined by a list of milestones.
  3. Final Answer:

    StepLR decreases learning rate at fixed intervals; MultiStepLR decreases at specific epochs. -> Option A
  4. Quick Check:

    StepLR fixed steps, MultiStepLR specific milestones [OK]
Hint: StepLR uses fixed steps; MultiStepLR uses milestone epochs [OK]
Common Mistakes:
  • Confusing increase vs decrease of learning rate
  • Thinking StepLR changes learning rate randomly
  • Mixing learning rate with batch size adjustments
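One way to see the difference is that StepLR behaves like MultiStepLR with evenly spaced milestones. The sketch below checks this for a 20-epoch run. The helper `lr_per_epoch` and the single dummy parameter are assumptions introduced only for illustration.

```python
import torch
from torch.optim.lr_scheduler import MultiStepLR, StepLR

def lr_per_epoch(make_scheduler, epochs=20, base_lr=0.1):
    """Return the learning rate in effect during each epoch (helper for this sketch)."""
    param = torch.nn.Parameter(torch.zeros(1))
    optimizer = torch.optim.SGD([param], lr=base_lr)
    scheduler = make_scheduler(optimizer)
    rates = []
    for _ in range(epochs):
        rates.append(optimizer.param_groups[0]["lr"])
        optimizer.step()   # no gradients are set, so nothing is updated here
        scheduler.step()
    return rates

# Fixed interval: a drop after every 5 scheduler steps.
fixed = lr_per_epoch(lambda opt: StepLR(opt, step_size=5, gamma=0.5))
# The same drops written out as explicit milestones.
listed = lr_per_epoch(lambda opt: MultiStepLR(opt, milestones=[5, 10, 15], gamma=0.5))
# Irregular milestones, which a single StepLR cannot express.
irregular = lr_per_epoch(lambda opt: MultiStepLR(opt, milestones=[3, 7], gamma=0.1))

print(all(abs(a - b) < 1e-12 for a, b in zip(fixed, listed)))
print([round(r, 4) for r in irregular[:9]])
```

The first two schedules coincide, while the third drops after 3 and 7 steps, spacing that no single `step_size` reproduces.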
2. Which of the following is the correct way to create a StepLR scheduler in PyTorch that reduces learning rate every 5 epochs by a factor of 0.1?
easy
A. scheduler = StepLR(optimizer, step_size=5, gamma=0.1)
B. scheduler = StepLR(optimizer, milestones=[5], gamma=0.1)
C. scheduler = MultiStepLR(optimizer, step_size=5, gamma=0.1)
D. scheduler = MultiStepLR(optimizer, milestones=[5], gamma=0.1)

Solution

  1. Step 1: Recall StepLR parameters

    StepLR takes step_size (int) and gamma (decay factor).
  2. Step 2: Identify correct syntax

    scheduler = StepLR(optimizer, step_size=5, gamma=0.1) uses step_size=5 and gamma=0.1, which matches the requirement.
  3. Final Answer:

    scheduler = StepLR(optimizer, step_size=5, gamma=0.1) -> Option A
  4. Quick Check:

    StepLR uses step_size, not milestones [OK]
Hint: StepLR uses step_size, MultiStepLR uses milestones list [OK]
Common Mistakes:
  • Using milestones parameter with StepLR
  • Confusing MultiStepLR and StepLR syntax
  • Passing step_size as a list
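3. Given the following code, what learning rate is printed for epoch 7 (the last iteration)?
import torch
from torch.optim.lr_scheduler import MultiStepLR

model = torch.nn.Linear(10, 2)  # placeholder model so the snippet runs
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)
for epoch in range(8):
    optimizer.step()  # training step goes here; PyTorch expects it before scheduler.step()
    scheduler.step()
    print(f"Epoch {epoch}: lr = {optimizer.param_groups[0]['lr']:.3f}")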
medium
A. 0.01
B. 0.001
C. 0.1
D. 0.0001

Solution

  1. Step 1: Understand milestones and gamma

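    MultiStepLR multiplies the learning rate by gamma=0.1 once the scheduler has been stepped 3 times, and again once it has been stepped 6 times.
  2. Step 2: Calculate learning rate at epoch 7

    Initial lr=0.1; after the 3rd scheduler.step(): 0.1*0.1=0.01; after the 6th: 0.01*0.1=0.001. No milestone follows, so the value printed for epoch 7 is 0.001.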
  3. Final Answer:

    0.001 -> Option B
  4. Quick Check:

    Two milestones reduce lr twice: 0.1 -> 0.01 -> 0.001 [OK]
Hint: Multiply lr by gamma at each milestone passed [OK]
Common Mistakes:
  • Forgetting to apply gamma at both milestones
  • Assuming lr changes before first milestone
  • Confusing StepLR with MultiStepLR behavior
4. Identify the error in this code snippet using StepLR:
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
scheduler = StepLR(optimizer, milestones=[10, 20], gamma=0.5)
for epoch in range(25):
    scheduler.step()
    print(optimizer.param_groups[0]['lr'])
medium
A. scheduler.step() must be called after optimizer.step() inside loop.
B. Optimizer Adam cannot be used with StepLR scheduler.
C. StepLR does not accept milestones parameter; use step_size instead.
D. Gamma value must be greater than 1 for StepLR.

Solution

  1. Step 1: Check StepLR parameters

    StepLR expects step_size, not milestones.
  2. Step 2: Identify misuse of milestones

    Passing milestones raises a TypeError because StepLR has no such argument; a valid call would be, for example, StepLR(optimizer, step_size=10, gamma=0.5).
  3. Final Answer:

    StepLR does not accept milestones parameter; use step_size instead. -> Option C
  4. Quick Check:

    StepLR uses step_size, not milestones [OK]
Hint: StepLR uses step_size, not milestones list [OK]
Common Mistakes:
  • Using milestones with StepLR
  • Thinking Adam optimizer is incompatible
  • Misunderstanding gamma parameter range
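The failure and one possible fix can be reproduced in a few lines. This is a sketch under assumptions: the dummy parameter stands in for `model.parameters()`, and `step_size=10` is just the example value from the solution, not necessarily what the snippet's author intended.

```python
import torch
from torch.optim.lr_scheduler import StepLR

param = torch.nn.Parameter(torch.zeros(1))
param.grad = torch.zeros_like(param)  # dummy gradient so optimizer.step() has work to do
optimizer = torch.optim.Adam([param], lr=0.01)

# The call from the question: StepLR has no `milestones` argument.
try:
    StepLR(optimizer, milestones=[10, 20], gamma=0.5)
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
    print(f"StepLR rejected the arguments: {exc}")

# A valid alternative: halve the rate after every 10 scheduler steps.
scheduler = StepLR(optimizer, step_size=10, gamma=0.5)
rates = []
for epoch in range(25):
    optimizer.step()
    scheduler.step()
    rates.append(scheduler.get_last_lr()[0])

print(rates[8], rates[9], rates[19])
```

Adam works with StepLR without any special handling, which is why option B is wrong.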
5. You want to train a model for 30 epochs. You want the learning rate to drop by 0.1 at epochs 10 and 20, and then again every 5 epochs after epoch 20. Which scheduler setup correctly achieves this?
hard
A. Use StepLR with step_size=10 and gamma=0.1
B. Use StepLR with step_size=5 and gamma=0.1
C. Use MultiStepLR with milestones=[10, 20, 25, 30] and gamma=0.1
D. Use MultiStepLR with milestones=[10, 20] and gamma=0.1, then StepLR with step_size=5 after epoch 20

Solution

  1. Step 1: Understand the requirement

    Learning rate drops at epochs 10 and 20, then every 5 epochs after 20 (i.e., 25, 30).
  2. Step 2: Analyze scheduler options

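    MultiStepLR can handle fixed milestones (10, 20). StepLR can handle regular steps (every 5 epochs). Using MultiStepLR up to epoch 20 and StepLR afterwards fits the requirement.
  3. Step 3: Evaluate options

    Option A (StepLR with step_size=10) drops every 10 epochs only, so it misses the drop at epoch 25. Option B (StepLR with step_size=5) drops every 5 epochs from the start, adding unwanted drops at epochs 5 and 15. Option C (MultiStepLR with milestones=[10, 20, 25, 30]) produces the required drops within a 30-epoch run, but only by listing each epoch by hand, so the every-5-epochs rule does not continue if training is extended. Option D (MultiStepLR with milestones=[10, 20], then StepLR with step_size=5 after epoch 20) expresses both parts of the rule directly; in practice the StepLR has to be created or activated at epoch 20, because running both schedulers from the start would add StepLR's drops at epochs 5 and 15.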
  4. Final Answer:

    Use MultiStepLR with milestones=[10, 20] and gamma=0.1, then StepLR with step_size=5 after epoch 20 -> Option D
  5. Quick Check:

    Combine MultiStepLR for early milestones + StepLR for regular steps after [OK]
Hint: Combine MultiStepLR for milestones + StepLR for regular steps [OK]
Common Mistakes:
  • Trying to use only one scheduler for mixed schedule
  • Misplacing milestones or step_size values
  • Assuming StepLR can handle irregular milestones
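As an alternative to switching schedulers at epoch 20, the mixed rule can be turned into a milestone list that is generated from the number of epochs. The sketch below is not one of the quiz options as written; the dummy parameter, the initial rate of 0.1, and the `rates` list are assumptions made for illustration.

```python
import torch
from torch.optim.lr_scheduler import MultiStepLR

num_epochs = 30
# Drops at 10 and 20, then every 5 epochs after 20 for as long as training runs.
milestones = [10, 20] + list(range(25, num_epochs, 5))

param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = MultiStepLR(optimizer, milestones=milestones, gamma=0.1)

rates = []
for epoch in range(num_epochs):
    rates.append(optimizer.param_groups[0]["lr"])  # rate in effect after `epoch` steps
    optimizer.step()   # placeholder for the real training step
    scheduler.step()

print(milestones)
print(f"{rates[9]:.0e} {rates[10]:.0e} {rates[20]:.0e} {rates[25]:.0e}")
```

Raising `num_epochs` extends the list automatically (35 and 40 epochs add milestones at 30 and 35), which keeps the every-5-epochs rule without a second scheduler.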