What is the main benefit of using Automatic Mixed Precision (AMP) in PyTorch training?
Think about how using smaller number formats affects speed and memory.
AMP runs eligible ops in float16 where it is numerically safe, reducing memory use and speeding up training with little or no loss of accuracy.
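A quick sketch of the effect (using CPU bfloat16 autocast here so it runs without a GPU; on CUDA the same mechanism uses float16):

```python
import torch

# autocast runs eligible ops (e.g. matmul) in a lower-precision dtype.
# CPU autocast uses bfloat16; CUDA autocast uses float16 the same way.
a = torch.randn(4, 4)
b = torch.randn(4, 4)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    c = a @ b

print(c.dtype)  # torch.bfloat16: half the bytes per element of float32
```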
What will be the printed loss value type after this AMP training step?
import torch
from torch.cuda.amp import autocast, GradScaler

model = torch.nn.Linear(2, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = GradScaler()
inputs = torch.tensor([[1.0, 2.0]], device='cuda')
target = torch.tensor([[1.0]], device='cuda')

optimizer.zero_grad()
with autocast():
    output = model(inputs)
    loss = torch.nn.functional.mse_loss(output, target)
print(type(loss))
AMP uses float16 for some ops but loss is usually float32 for stability.
type(loss) prints <class 'torch.Tensor'> (type() reports the class, not the dtype); the tensor's dtype is torch.float32, because loss ops such as mse_loss are autocast to float32 for numerical stability even inside the autocast region.
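A CPU-friendly sketch of this behavior (bfloat16 autocast on CPU stands in for float16 on CUDA; mse_loss is on the float32 list for both backends):

```python
import torch

# The linear layer's output is low precision, but the loss op is
# autocast back to float32 for numerical stability.
model = torch.nn.Linear(2, 1)
inputs = torch.tensor([[1.0, 2.0]])
target = torch.tensor([[1.0]])

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    output = model(inputs)
    loss = torch.nn.functional.mse_loss(output, target)

print(type(loss))    # the class: <class 'torch.Tensor'>
print(output.dtype)  # torch.bfloat16
print(loss.dtype)    # torch.float32
```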
Which part of a model should NOT be wrapped inside autocast() for AMP training?
Consider which operations benefit from mixed precision and which do not.
The backward pass and the optimizer step should not be inside autocast(); autocast wraps only the forward pass and loss computation, while the optimizer updates the master weights in full float32.
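A minimal sketch of the correct placement (CPU bfloat16 autocast here, so it runs without a GPU; bfloat16 also lets us skip GradScaler, which float16 on CUDA would need):

```python
import torch

model = torch.nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs = torch.tensor([[1.0, 2.0]])
target = torch.tensor([[1.0]])

optimizer.zero_grad()
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    output = model(inputs)                               # inside: forward
    loss = torch.nn.functional.mse_loss(output, target)  # inside: loss

loss.backward()    # outside: backward pass
optimizer.step()   # outside: weight update on float32 master weights

print(model.weight.dtype)  # torch.float32
```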
What is the role of GradScaler in PyTorch AMP training?
Think about why gradients might vanish when using float16.
GradScaler multiplies loss by a scale factor to keep gradients in a safe range, avoiding underflow in float16.
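What GradScaler does can be sketched by hand with plain CPU tensors (65536 matches GradScaler's default initial scale; the actual class also skips steps when gradients overflow):

```python
import torch

# Scale the loss before backward so tiny gradients survive float16,
# then divide ("unscale") the gradients before the optimizer step.
scale = 65536.0  # GradScaler's default initial scale

w = torch.tensor([1.0], requires_grad=True)
loss = (w * 1e-5) ** 2           # produces a very small gradient, ~2e-10

(loss * scale).backward()        # scaled backward: grad = scale * true grad
w.grad.div_(scale)               # unscale before updating weights

print(w.grad)  # back to the true gradient, ~2e-10
```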
Given this AMP training snippet, what error will occur and why?
import torch
from torch.cuda.amp import autocast, GradScaler
model = torch.nn.Linear(2, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = GradScaler()
inputs = torch.tensor([[1.0, 2.0]], device='cuda')
target = torch.tensor([[1.0]], device='cuda')
optimizer.zero_grad()
with autocast():
    output = model(inputs)
    loss = torch.nn.functional.mse_loss(output, target)
loss.backward()
scaler.step(optimizer)
scaler.update()
Check how gradients are computed when using GradScaler.
scaler.step(optimizer) fails because scaler.scale() was never called, so the scaler has no scale factor to unscale gradients with. When using GradScaler, loss.backward() must be replaced by scaler.scale(loss).backward() so the loss (and thus the gradients) is scaled properly.
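A corrected version of the snippet, guarded with enabled=torch.cuda.is_available() so it also runs (as plain float32 training, with autocast and GradScaler acting as no-ops) on a CPU-only machine:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"

model = torch.nn.Linear(2, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = GradScaler(enabled=use_cuda)
inputs = torch.tensor([[1.0, 2.0]], device=device)
target = torch.tensor([[1.0]], device=device)

optimizer.zero_grad()
with autocast(enabled=use_cuda):
    output = model(inputs)
    loss = torch.nn.functional.mse_loss(output, target)

scaler.scale(loss).backward()  # scale the loss, then backprop
scaler.step(optimizer)         # unscales grads, skips step on inf/nan
scaler.update()                # adjusts the scale factor for next iteration
```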