
__getitem__ and __len__ in PyTorch - ML Experiment: Train & Evaluate

Experiment - __getitem__ and __len__
Problem: You have a custom dataset class in PyTorch, but the DataLoader raises errors or does not load data properly.
Current Metrics: DataLoader raises IndexError or returns empty batches.
Issue: The __getitem__ and __len__ methods are missing or implemented incorrectly, which breaks data loading.
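To make the failure modes concrete, here is a minimal sketch of two broken datasets. The class names OffByOneDataset and ZeroLengthDataset are hypothetical and exist only for illustration: the first overstates its length, so the DataLoader eventually asks for an index that does not exist; the second reports a length of zero, so the DataLoader yields nothing at all.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class OffByOneDataset(Dataset):
    """Hypothetical broken dataset: __len__ claims one sample too many."""

    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data) + 1  # bug: overstates the number of samples

    def __getitem__(self, idx):
        return self.data[idx]


class ZeroLengthDataset(Dataset):
    """Hypothetical broken dataset: __len__ always reports zero samples."""

    def __init__(self, data):
        self.data = data

    def __len__(self):
        return 0  # bug: the sampler sees no indices to draw

    def __getitem__(self, idx):
        return self.data[idx]


samples = [torch.tensor([i]) for i in range(10)]

# Case 1: the last batch requests index 10, which the list does not have.
batches_loaded = 0
error_name = None
try:
    for batch in DataLoader(OffByOneDataset(samples), batch_size=3, shuffle=False):
        batches_loaded += 1
except IndexError as exc:
    error_name = type(exc).__name__

# Case 2: iteration finishes immediately without producing any batch.
empty_result = list(DataLoader(ZeroLengthDataset(samples), batch_size=3, shuffle=False))

print(batches_loaded, error_name, len(empty_result))
```

In the first case the error only appears on the final batch, which is why an off-by-one in __len__ can go unnoticed until the end of an epoch.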
Your Task
Implement __getitem__ and __len__ methods correctly in a PyTorch Dataset class so that DataLoader can load data batches without errors.
You must subclass the PyTorch Dataset class (torch.utils.data.Dataset).
Do not change the data source (a list of samples).
Keep the data loading simple (no transformations needed).
Solution
PyTorch
import torch
from torch.utils.data import Dataset, DataLoader

class CustomDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]

# Example data
samples = [torch.tensor([i]) for i in range(10)]

# Create dataset and dataloader
dataset = CustomDataset(samples)
dataloader = DataLoader(dataset, batch_size=3, shuffle=False)

# Iterate and print batches
for batch in dataloader:
    print(batch)
Implemented __len__ to return the length of the data list.
Implemented __getitem__ to return the data sample at the given index.
Created a simple dataset with 10 tensor samples.
Used DataLoader to load data in batches of 3.
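As a quick sanity check of the steps above, the following sketch inspects the dataset and the DataLoader directly. It reuses the CustomDataset class from the solution; the variable names num_batches and shapes are illustrative choices, not part of the original exercise.

```python
import math

import torch
from torch.utils.data import Dataset, DataLoader


class CustomDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


samples = [torch.tensor([i]) for i in range(10)]
dataset = CustomDataset(samples)
dataloader = DataLoader(dataset, batch_size=3, shuffle=False)

# len(dataset) goes through __len__; dataset[0] goes through __getitem__.
print(len(dataset))   # 10
print(dataset[0])     # tensor([0])

# len(dataloader) is the number of batches: ceil(10 / 3) = 4.
num_batches = len(dataloader)
print(num_batches == math.ceil(len(dataset) / 3))

# Each sample has shape (1,), so default collation stacks them into (batch, 1).
shapes = [tuple(batch.shape) for batch in dataloader]
print(shapes)         # [(3, 1), (3, 1), (3, 1), (1, 1)]
```

The last batch is smaller because 10 is not a multiple of 3 and the DataLoader keeps the incomplete batch by default.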
Results Interpretation

Before: DataLoader raised errors or returned empty batches due to missing or incorrect __getitem__ and __len__.

After: DataLoader loads all 10 samples without errors, as four batches: three batches of 3 samples and a final batch of 1.

Implementing __len__ and __getitem__ correctly in a PyTorch Dataset is essential for DataLoader to work properly and load data in batches.
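To see why both methods matter, it can help to reproduce the batching by hand. The sketch below is a simplified approximation of what a DataLoader does for this dataset with shuffle=False and default collation, not its actual implementation: it uses __len__ to decide which indices exist, __getitem__ to fetch each sample, and torch.stack to combine samples into a batch.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class CustomDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


samples = [torch.tensor([i]) for i in range(10)]
dataset = CustomDataset(samples)
batch_size = 3

# Manual batching: a rough stand-in for sequential sampling plus collation.
manual_batches = []
for start in range(0, len(dataset), batch_size):        # relies on __len__
    stop = min(start + batch_size, len(dataset))
    items = [dataset[i] for i in range(start, stop)]     # relies on __getitem__
    manual_batches.append(torch.stack(items))

# The real DataLoader should produce the same tensors in the same order.
loader_batches = list(DataLoader(dataset, batch_size=batch_size, shuffle=False))
matches = len(manual_batches) == len(loader_batches) and all(
    torch.equal(m, l) for m, l in zip(manual_batches, loader_batches)
)
print(matches)
```

If __len__ is wrong, the index range is wrong; if __getitem__ is wrong, the fetched items are wrong. Either mistake surfaces in the batches.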
Bonus Experiment
Modify __getitem__ to return both the data sample and its index as a tuple.
💡 Hint
Change the return statement in __getitem__ to return (self.data[idx], idx).
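One possible version of the bonus experiment is sketched below. The class name IndexedDataset is a hypothetical choice for this example. With default collation, each batch then contains two elements: the stacked samples and a tensor of their indices.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class IndexedDataset(Dataset):
    """Hypothetical variant of CustomDataset that also returns the index."""

    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Return the sample together with its position in the dataset.
        return self.data[idx], idx


samples = [torch.tensor([i]) for i in range(10)]
dataloader = DataLoader(IndexedDataset(samples), batch_size=3, shuffle=False)

collected_indices = []
first_data_shape = None
for data, idx in dataloader:
    # data holds the stacked samples; idx holds the collated integer indices.
    if first_data_shape is None:
        first_data_shape = tuple(data.shape)
    collected_indices.extend(idx.tolist())
    print(data.flatten().tolist(), idx.tolist())
```

Returning the index is useful when shuffle=True, because it lets you trace each item in a batch back to its original position in the data list.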