nn.DataParallel replicates the model on each GPU, splits the input batch along dimension 0, runs the replicas in parallel, and gathers the outputs back on the primary device. This speeds up training by spreading the per-batch work across all available GPUs.
import torch
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)

model = SimpleModel()
model = nn.DataParallel(model)
input_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
output = model(input_tensor)
print(output.shape)  # torch.Size([2, 1])
The input has batch size 2 and 2 features. The linear layer maps each 2-feature row to a single output, so the result has one value per sample. DataParallel splits the batch across GPUs but concatenates the per-GPU outputs along dimension 0, so the final output shape is still (2, 1).
DataParallel automatically splits the batch across GPUs, so you can keep the original batch size: it is divided as evenly as possible among the devices (for example, a batch of 3 on 2 GPUs becomes chunks of 2 and 1).
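The splitting behavior can be sketched on the CPU with torch.chunk, which splits a tensor along a dimension the same way DataParallel's scatter step divides the batch (the GPU count here is a hypothetical value for illustration):

```python
import torch

# Illustrative sketch: DataParallel scatters the input along dim 0 into
# roughly equal chunks, one per GPU, similar to torch.chunk.
batch = torch.randn(3, 10)   # batch size 3
num_gpus = 2                 # hypothetical GPU count
chunks = batch.chunk(num_gpus, dim=0)
print([c.shape[0] for c in chunks])  # [2, 1]: replica 0 gets 2 samples, replica 1 gets 1
```

Note the split is only even when the batch size is divisible by the number of GPUs; otherwise the last chunk is smaller.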
import torch
import torch.nn as nn

model = nn.Linear(10, 5)                  # parameters live on CPU
model = nn.DataParallel(model)
input_tensor = torch.randn(3, 10).cuda()  # input lives on GPU
output = model(input_tensor)              # RuntimeError: parameters and input on different devices
print(output)
The model's parameters are created on CPU by default, but DataParallel expects the wrapped module to already be on the primary GPU (cuda:0) before wrapping. Because the input is on GPU while the parameters are on CPU, the forward pass raises a RuntimeError about the device mismatch.
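A fixed version moves the model to the GPU before wrapping and keeps the inputs on the same device. The sketch below guards on CUDA availability so it also runs on a CPU-only machine:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 5).to(device)       # move model to the device *before* wrapping
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)        # only useful with multiple GPUs

input_tensor = torch.randn(3, 10).to(device)  # input on the same device as the model
output = model(input_tensor)
print(output.shape)  # torch.Size([3, 5])
```

The key point is the ordering: `.to(device)` first, then `nn.DataParallel(...)`, and every input tensor sent to the same device.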
DistributedDataParallel (DDP) is recommended for multi-GPU training: it runs one process per GPU instead of one multi-threaded process, so it avoids Python's GIL bottleneck, and it overlaps gradient all-reduce with the backward pass rather than gathering everything on a single primary device as DataParallel does.
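A minimal single-process sketch of the DDP setup is below, using the gloo backend on CPU so it runs anywhere; in real multi-GPU training you would launch one process per GPU (e.g. with torchrun), and the address/port values here are arbitrary placeholders:

```python
import os
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process sketch (gloo backend, CPU) just to show the wrapping;
# real training launches one process per GPU, e.g. via torchrun.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder rendezvous address
os.environ.setdefault("MASTER_PORT", "29500")      # placeholder port
dist.init_process_group("gloo", rank=0, world_size=1)

model = nn.Linear(10, 5)
ddp_model = DDP(model)  # gradients are all-reduced across processes during backward()

out = ddp_model(torch.randn(3, 10))
out.sum().backward()
print(out.shape)  # torch.Size([3, 5])

dist.destroy_process_group()
```

Unlike DataParallel, DDP synchronizes gradients between processes during `backward()`, so each process keeps a full model replica and optimizer step, and no single device has to gather all outputs.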