PyTorch ML · ~20 mins

DataLoader basics in PyTorch - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
How many batches does this DataLoader produce?

Given the following PyTorch DataLoader setup, how many batches will it produce?

PyTorch
from torch.utils.data import DataLoader, TensorDataset
import torch

data = torch.arange(10)
dataset = TensorDataset(data)
dataloader = DataLoader(dataset, batch_size=3, shuffle=False)

batches = list(dataloader)
print(len(batches))
A. 3
B. 4
C. 5
D. 10
💡 Hint

Think about how many full batches of size 3 fit into 10 items, and what happens to the leftover items.
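You can check the hint's arithmetic without torch: with the default drop_last=False, DataLoader keeps the final partial batch, so the batch count is the ceiling of len(dataset) / batch_size. A minimal sketch:

```python
import math

num_samples = 10
batch_size = 3

# drop_last=False (the default) keeps the final partial batch,
# so the batch count is the ceiling of samples / batch_size.
num_batches = math.ceil(num_samples / batch_size)
print(num_batches)  # → 4
```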

Model Choice
intermediate
Which DataLoader option enables random shuffling of data each epoch?

You want your model to see data in a different order every time you train. Which DataLoader parameter should you set?

A. drop_last=True
B. batch_size=1
C. num_workers=4
D. shuffle=True
💡 Hint

Think about which option changes the order of data samples.
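For intuition: shuffling in DataLoader means drawing a fresh random permutation of dataset indices each epoch (internally via a RandomSampler). A plain-Python analogue of that behavior, with no torch required:

```python
import random

indices = list(range(10))

# Each epoch gets its own random permutation; the samples themselves
# are unchanged, only the visiting order differs between epochs.
epoch1 = random.sample(indices, len(indices))
epoch2 = random.sample(indices, len(indices))

# Both epochs still cover every sample exactly once.
print(sorted(epoch1) == indices and sorted(epoch2) == indices)  # → True
```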

Hyperparameter
advanced
What effect does setting drop_last=True have in DataLoader?

Consider a dataset with 10 samples and batch_size=4. What happens if drop_last=True?

A. The last batch with fewer than 4 samples is dropped, so only 2 batches are returned.
B. The last batch is padded with zeros to reach batch size 4.
C. All batches are returned, including the last smaller batch.
D. DataLoader raises an error if the last batch is smaller than batch size.
💡 Hint

Think about what happens to incomplete batches when drop_last is True.
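The effect is pure arithmetic: with drop_last=True the batch count is floor division, while the default keeps the partial batch and gives the ceiling. A quick sketch, no torch required:

```python
import math

num_samples, batch_size = 10, 4

kept = num_samples // batch_size                   # drop_last=True: only full batches
all_batches = math.ceil(num_samples / batch_size)  # drop_last=False: partial batch kept

print(kept, all_batches)  # → 2 3
```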

🔧 Debug
advanced
Why does this DataLoader code raise a RuntimeError?

Examine the code below. When run as a top-level script on Windows, it raises a RuntimeError about starting worker processes. Why?

PyTorch
from torch.utils.data import DataLoader, TensorDataset
import torch

data = torch.arange(5)
dataset = TensorDataset(data)
dataloader = DataLoader(dataset, batch_size=2, num_workers=2)

for batch in dataloader:
    print(batch)
A. TensorDataset cannot be used with num_workers > 0.
B. batch_size must be 1 when using num_workers > 0.
C. num_workers > 0 requires the code to be inside an `if __name__ == "__main__"` block on Windows.
D. DataLoader does not support iteration with num_workers > 0.
💡 Hint

Consider platform-specific multiprocessing rules in Python.
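The fix is structural: on platforms that use the "spawn" start method (Windows, and macOS by default), each worker process re-imports the main module, so process-spawning code must live under a `__main__` guard. A torch-free analogue using the standard multiprocessing module (the worker function and sizes here are illustrative):

```python
import multiprocessing as mp

def load_item(i):
    # Stand-in for the per-sample work a DataLoader worker would do.
    return i * 2

def main():
    # Under "spawn", workers re-import this module; without the guard
    # below, that re-import would recursively try to start more workers,
    # which is what triggers the RuntimeError.
    with mp.get_context("spawn").Pool(2) as pool:
        print(pool.map(load_item, range(4)))  # → [0, 2, 4, 6]

if __name__ == "__main__":
    main()
```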

🧠 Conceptual
expert
What is the main advantage of using DataLoader with multiple workers?

Why would you set num_workers > 0 in a DataLoader? Choose the best explanation.

A. It allows loading data in parallel, reducing waiting time during training.
B. It increases the batch size automatically for faster training.
C. It guarantees data is shuffled more thoroughly each epoch.
D. It compresses data to save memory during loading.
💡 Hint

Think about how multiple workers affect data loading speed.
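The speedup can be simulated without torch: when each "load" mostly waits (disk I/O, image decoding), overlapping the waits cuts total time. DataLoader uses worker processes; this sketch uses threads as a stand-in for the same idea:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def load_sample(i):
    time.sleep(0.02)  # stand-in for I/O-bound loading work
    return i

start = time.perf_counter()
serial = [load_sample(i) for i in range(8)]
serial_time = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(load_sample, range(8)))
parallel_time = time.perf_counter() - start

# Same results either way; the parallel version overlaps the waiting.
print(parallel == serial, parallel_time < serial_time)
```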