Challenge - 5 Problems
Dataset Mastery
❓ Predict Output
Difficulty: intermediate
Output of __getitem__ in a PyTorch Dataset
Consider this PyTorch Dataset class. What will be the output of dataset[2]?

import torch
from torch.utils.data import Dataset

class SimpleDataset(Dataset):
    def __init__(self):
        self.data = [10, 20, 30, 40, 50]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx] * 2

dataset = SimpleDataset()
output = dataset[2]
💡 Hint
Remember that __getitem__ returns the element at the given index multiplied by 2.
✅ Explanation
The __getitem__ method returns self.data[idx] * 2. At index 2, self.data[2] is 30, so output is 60.
❓ Model Choice
Difficulty: intermediate
Choosing __len__ for a Custom Dataset
You have a dataset class with a list of 1000 images. Which implementation of __len__ correctly returns the dataset size?
💡 Hint
The length should reflect the number of items in the images list.
✅ Explanation
The __len__ method should return the number of items in the dataset, which is the length of the images list.
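As a minimal sketch of that idea, assume the images are stored in a list attribute named images. The class name ImageListDataset and the placeholder file names are hypothetical. In real PyTorch code the class would normally subclass torch.utils.data.Dataset; that is left out here to keep the sketch dependency-free, since the __len__ protocol is the same.

```python
class ImageListDataset:
    """Hypothetical dataset holding 1000 image placeholders in a list."""

    def __init__(self):
        # Placeholder file names stand in for real image data (assumption).
        self.images = [f"img_{i:04d}.png" for i in range(1000)]

    def __len__(self):
        # Derive the size from the container instead of hard-coding 1000,
        # so the value stays correct if the list ever changes.
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx]


dataset = ImageListDataset()
size = len(dataset)  # len() dispatches to __len__
```

Returning len(self.images) rather than a literal keeps __len__ and the underlying storage from drifting apart.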
🔧 Debug
Difficulty: advanced
Debugging __getitem__ Index Error
This Dataset class raises an IndexError when accessing dataset[5]. Why?

class MyDataset:
    def __init__(self):
        self.data = [1, 2, 3, 4, 5]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx + 1]

dataset = MyDataset()
output = dataset[5]
💡 Hint
Check how the index is used inside __getitem__.
✅ Explanation
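dataset[5] calls __getitem__(5), which tries to read self.data[6]. Since data has length 5 (valid indices 0 to 4), index 6 is out of range and Python raises IndexError. The idx + 1 offset is the bug: it also makes dataset[4] fail and makes dataset[0] return 2 instead of 1.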
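One possible fix, sketched under the assumption that the intent was plain positional access, is to index with idx directly. The class name FixedDataset is hypothetical. Note that dataset[5] is still out of range afterwards, because a 5-element list only has indices 0 to 4.

```python
class FixedDataset:
    def __init__(self):
        self.data = [1, 2, 3, 4, 5]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Use idx directly so the valid indices are 0 .. len(self) - 1.
        return self.data[idx]


dataset = FixedDataset()
first = dataset[0]  # the first element is reachable again
last = dataset[4]   # the last valid index no longer overflows

try:
    dataset[5]      # still invalid: only indices 0..4 exist
    out_of_range = False
except IndexError:
    out_of_range = True
```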
❓ Hyperparameter
Difficulty: advanced
Effect of __len__ on DataLoader Batching
If a Dataset's __len__ returns 50 but the dataset actually contains 100 samples, what happens when it is used with a DataLoader with batch size 10?
💡 Hint
DataLoader uses __len__ to know dataset size.
✅ Explanation
With its default sampler, DataLoader uses __len__ to decide which indices to request. If __len__ returns 50, only indices 0 through 49 are requested, so it yields 5 batches of 10 and the other 50 samples are never loaded.
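This behaviour can be checked with a small experiment. The sketch below assumes a map-style dataset, the default sequential sampler (no shuffle, no custom sampler), and a hypothetical class named UnderreportingDataset whose samples are simply the integers 0 to 99.

```python
from torch.utils.data import DataLoader, Dataset


class UnderreportingDataset(Dataset):
    """Hypothetical dataset that stores 100 samples but reports only 50."""

    def __init__(self):
        self.data = list(range(100))

    def __len__(self):
        return 50  # deliberately wrong: the real size is 100

    def __getitem__(self, idx):
        return self.data[idx]


loader = DataLoader(UnderreportingDataset(), batch_size=10)

# Each batch is collated into a tensor; convert to lists for inspection.
batches = [batch.tolist() for batch in loader]

num_batches = len(batches)
num_samples = sum(len(b) for b in batches)
largest_value_seen = max(max(b) for b in batches)
```

Under these assumptions the loader produces 5 batches covering 50 samples, and no value from the second half of self.data ever appears.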
🧠 Conceptual
Difficulty: expert
Why Implement Both __getitem__ and __len__ in PyTorch Dataset?
Why is it important to implement both __getitem__ and __len__ methods in a PyTorch Dataset class?
💡 Hint
Think about how DataLoader interacts with Dataset.
✅ Explanation
DataLoader calls __len__ to know how many samples exist and __getitem__ to get each sample by index during training.
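To make the division of labour concrete, here is a simplified stand-in for a sequential, non-shuffling loader written in plain Python. It is an illustration only, not PyTorch's actual DataLoader implementation; the names SquaresDataset and simple_batches are hypothetical.

```python
class SquaresDataset:
    """Hypothetical map-style dataset: sample i is i squared."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        if not 0 <= idx < self.n:
            raise IndexError(idx)
        return idx * idx


def simple_batches(dataset, batch_size):
    """Yield lists of samples in index order (simplified illustration)."""
    indices = range(len(dataset))  # __len__ decides which indices exist
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        yield [dataset[i] for i in chunk]  # __getitem__ fetches each sample


batches = list(simple_batches(SquaresDataset(5), batch_size=2))
```

Without __len__ the loop would not know where to stop, and without __getitem__ it would have no way to turn an index into a sample.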