
Mixed precision training (AMP) in PyTorch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is mixed precision training in deep learning?
Mixed precision training uses both 16-bit and 32-bit floating point numbers during model training to speed up computation and reduce memory use, while keeping model accuracy.
beginner
What does AMP stand for in PyTorch?
AMP stands for Automatic Mixed Precision. It is a PyTorch feature that automatically manages mixed precision training to make it easier and safer.
beginner
Why use mixed precision training? List two benefits.
1. Faster training, because 16-bit operations are quicker on modern GPUs.
2. Less GPU memory used, allowing bigger models or larger batches.
intermediate
How does PyTorch AMP help prevent underflow or overflow during training?
PyTorch AMP uses a technique called loss scaling. It multiplies the loss by a large factor before backpropagation so that small gradients stay representable in float16, then unscales the gradients before the optimizer step.
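The effect of loss scaling can be sketched with NumPy (the 2**16 scale factor mirrors GradScaler's default initial scale; the variable names are illustrative):

```python
import numpy as np

# A tiny gradient value that float16 cannot represent on its own.
grad = 1e-8

# Loss scaling: multiply before casting to float16 so the value survives.
scale = 2.0 ** 16                       # GradScaler's default init_scale (65536)
scaled = np.float16(grad * scale)       # 6.55e-4, well within float16 range
unscaled = np.float32(scaled) / scale   # unscale in float32 before the optimizer step

print(np.float16(grad))  # 0.0 -- without scaling, the gradient underflows
print(unscaled)          # ~1e-8, the gradient is recovered
```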
intermediate
Show a simple PyTorch code snippet to enable AMP during training.
scaler = torch.cuda.amp.GradScaler()
for data, target in dataloader:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        output = model(data)
        loss = loss_fn(output, target)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
What is the main purpose of using mixed precision training?
A. Make training more accurate by using 64-bit floats
B. Increase model size without changing speed
C. Speed up training and reduce memory use
D. Avoid using GPUs
Answer: C
In PyTorch AMP, what does the GradScaler do?
A. Scales the model weights
B. Scales the loss to prevent gradient underflow
C. Scales the input data
D. Scales the learning rate
Answer: B
Which PyTorch context manager is used to enable mixed precision operations?
A. torch.cuda.amp.autocast()
B. torch.cuda.amp.scale()
C. torch.enable_grad()
D. torch.no_grad()
Answer: A
What data types are mainly used in mixed precision training?
A. float16 and float32
B. float64 and float128
C. int32 and int64
D. int8 and float8
Answer: A
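The memory saving behind this answer is easy to verify; NumPy's float16 and float32 have the same per-element sizes as torch.float16 and torch.float32:

```python
import numpy as np

# float16 halves the per-element storage relative to float32,
# which is where AMP's memory savings come from.
print(np.dtype(np.float16).itemsize)  # 2 bytes
print(np.dtype(np.float32).itemsize)  # 4 bytes
```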
What happens if you do not use loss scaling in mixed precision training?
A. Training will be faster
B. Memory usage will increase
C. Model accuracy will always improve
D. Gradients might become zero due to underflow
Answer: D
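The underflow in option D can be reproduced directly; NumPy's float16 behaves like torch.float16 here:

```python
import numpy as np

# Values smaller than float16's tiniest subnormal (~6e-8) round to zero.
tiny_grad = np.float16(1e-8)
print(tiny_grad)         # 0.0 -- the gradient signal is lost
print(np.float32(1e-8))  # 1e-08 -- float32 still represents it
```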
Explain how mixed precision training works and why it is useful.
Think about how using smaller numbers can help speed and memory, but also what problem it causes.
Describe the steps to enable AMP in a PyTorch training loop.
Focus on the order of operations inside the training loop.
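Putting those steps in order, here is a minimal runnable sketch; the model, data, and hyperparameters are made up for illustration, and the enabled=use_amp flag lets the same loop fall back to plain float32 when no GPU is present:

```python
import torch

# Toy setup -- the model, shapes, and learning rate are illustrative only.
use_amp = torch.cuda.is_available()  # AMP needs a GPU; otherwise run as a no-op
device = "cuda" if use_amp else "cpu"
model = torch.nn.Linear(8, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

# 1. Create the GradScaler once, outside the loop.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for step in range(3):
    data = torch.randn(16, 8, device=device)
    target = torch.randn(16, 1, device=device)
    # 2. Zero stale gradients.
    optimizer.zero_grad()
    # 3. Run the forward pass and loss computation under autocast.
    with torch.cuda.amp.autocast(enabled=use_amp):
        output = model(data)
        loss = loss_fn(output, target)
    # 4. Scale the loss, then backpropagate.
    scaler.scale(loss).backward()
    # 5. Unscale gradients and take the optimizer step (skipped on overflow).
    scaler.step(optimizer)
    # 6. Update the scale factor for the next iteration.
    scaler.update()
```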