PyTorch · ~20 mins

DataParallel basics in PyTorch - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️
DataParallel Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
intermediate
Why use DataParallel in PyTorch?
You have a deep learning model and a machine with multiple GPUs. What is the main reason to use DataParallel in PyTorch?
A. To reduce the size of the model so it fits into a single GPU's memory.
B. To split the input data across multiple GPUs and run the model in parallel to speed up training.
C. To automatically tune hyperparameters during training.
D. To convert the model into a CPU-only version for faster inference.
💡 Hint
Think about how multiple GPUs can be used to handle more data at once.
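As a rough, GPU-free sketch of the idea behind this question: DataParallel follows a scatter-replicate-gather pattern. The plain-Python helpers below (`model`, `data_parallel_forward`) are invented for illustration, not real PyTorch APIs; real replicas run concurrently on separate devices.

```python
def model(chunk):
    # stand-in for one replica's forward pass (here: just double each value)
    return [2 * x for x in chunk]

def data_parallel_forward(batch, num_devices):
    # scatter: split the batch into roughly equal chunks, one per device
    size = -(-len(batch) // num_devices)  # ceil division
    chunks = [batch[i:i + size] for i in range(0, len(batch), size)]
    # each replica processes its own chunk (sequential here, parallel on GPUs)
    partial_outputs = [model(chunk) for chunk in chunks]
    # gather: concatenate partial outputs back into one batch
    return [y for part in partial_outputs for y in part]

batch = [1, 2, 3, 4, 5, 6, 7, 8]
out = data_parallel_forward(batch, num_devices=4)
print(out)  # same batch size in, same batch size out
```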
Predict Output
intermediate
Output of DataParallel model forward pass
Consider this PyTorch code snippet using DataParallel:
PyTorch
import torch
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)

model = SimpleModel()
model = nn.DataParallel(model)
input_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
output = model(input_tensor)
print(output.shape)
A. RuntimeError due to input on CPU but model on GPU
B. torch.Size([1, 2])
C. torch.Size([2, 1])
D. torch.Size([4, 1])
💡 Hint
DataParallel splits the input along the batch dimension, but the gathered output's batch size matches the input's.
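For intuition, a torch-free sketch of why the gathered output keeps the input's batch size while the feature dimension comes from the layer. The helpers here (`linear_forward`, `data_parallel_output_shape`) are hypothetical names, not PyTorch APIs:

```python
def linear_forward(rows, out_features=1):
    # stand-in for nn.Linear(2, 1): each input row yields out_features values
    return [[0.0] * out_features for _ in rows]

def data_parallel_output_shape(batch, num_devices, out_features=1):
    # scatter along dim 0, run each chunk through a replica, gather results
    size = -(-len(batch) // num_devices)  # ceil division
    chunks = [batch[i:i + size] for i in range(0, len(batch), size)]
    gathered = [row for c in chunks for row in linear_forward(c, out_features)]
    return (len(gathered), out_features)

batch = [[1.0, 2.0], [3.0, 4.0]]   # batch of 2 samples, 2 features each
print(data_parallel_output_shape(batch, num_devices=2))  # (2, 1)
```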
Hyperparameter
advanced
Choosing batch size with DataParallel
You want to train a model using DataParallel on 4 GPUs. Your original batch size is 64. What is the best way to set the batch size when using DataParallel?
A. Keep batch size 64; DataParallel will split it automatically across GPUs.
B. Set batch size to 16 because DataParallel requires batch size per GPU.
C. Increase batch size to 256 to fully use all GPUs.
D. Set batch size to 1 because DataParallel only supports batch size 1.
💡 Hint
Think about how DataParallel handles the batch dimension internally.
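To ground the hint in numbers, a quick sketch (illustrative values only) of the split DataParallel performs internally along the batch dimension:

```python
# You pass the full batch; DataParallel chunks it along dim 0, one chunk
# per GPU, so each replica sees global_batch / num_gpus samples per step.
global_batch = 64
num_gpus = 4
per_gpu = global_batch // num_gpus
print(per_gpu)  # 16
```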
🔧 Debug
advanced
Why does this DataParallel code raise an error?
Look at this code snippet:
PyTorch
import torch
import torch.nn as nn

model = nn.Linear(10, 5)
model = nn.DataParallel(model)
input_tensor = torch.randn(3, 10).cuda()
output = model(input_tensor)
print(output)
A. Raises AttributeError because DataParallel is not imported.
B. Runs fine and prints the output tensor.
C. Raises TypeError because the input tensor shape is wrong.
D. Raises RuntimeError because the model is on CPU but the input is on GPU.
💡 Hint
Check where the model parameters are located before wrapping with DataParallel.
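A toy, torch-free mock of the device-mismatch check behind this kind of error. The `Tensor` class and `forward` function below are invented for illustration; real PyTorch performs an analogous check inside its kernels:

```python
class Tensor:
    """Minimal stand-in carrying only a device tag."""
    def __init__(self, device):
        self.device = device

def forward(weight, x):
    # kernels can only combine tensors that live on the same device
    if weight.device != x.device:
        raise RuntimeError(
            f"Expected all tensors on the same device, "
            f"but found {weight.device} and {x.device}"
        )
    return Tensor(x.device)

weight = Tensor("cpu")     # model was never moved: parameters stay on CPU
x = Tensor("cuda:0")       # input was moved with .cuda()
try:
    forward(weight, x)
except RuntimeError as e:
    print("RuntimeError:", e)
```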
Model Choice
expert
Best model wrapping for multi-GPU training
You have a complex model with custom layers and want to train on multiple GPUs. Which approach is best to ensure correct gradient updates and efficient training?
A. Use nn.parallel.DistributedDataParallel with proper setup for multi-GPU training.
B. Train the model on CPU and manually split data across GPUs.
C. Wrap the model with nn.DataParallel before moving it to GPU.
D. Wrap the model with nn.DataParallel after moving it to GPU.
💡 Hint
Consider scalability and correctness for multi-GPU training.