PyTorch · ML · ~15 mins

DataLoader basics in PyTorch - ML Experiment: Train & Evaluate

Experiment - DataLoader basics
Problem: You want to load and batch your dataset efficiently for training a neural network in PyTorch.
Current Metrics: N/A; data is currently loaded manually, without batching or shuffling.
Issue: Manual data loading is slow and error-prone, and the lack of batching and shuffling makes training inefficient.
Your Task
Use PyTorch DataLoader to load data in batches with shuffling to improve training efficiency.
Use the provided simple dataset (a list of numbers).
Batch size must be 4.
Enable shuffling of data.
Solution
PyTorch
import torch
from torch.utils.data import TensorDataset, DataLoader

# Create a simple dataset of numbers 0 to 19
data = torch.arange(20)

# Wrap data in TensorDataset (no labels needed here)
dataset = TensorDataset(data)

# Create DataLoader with batch size 4 and shuffle enabled
dataloader = DataLoader(dataset, batch_size=4, shuffle=True)

# Iterate over DataLoader and print batches
for batch_idx, (batch_data,) in enumerate(dataloader):
    print(f"Batch {batch_idx + 1}: {batch_data.tolist()}")
Wrapped raw data in TensorDataset to make it compatible with DataLoader.
Created DataLoader with batch_size=4 and shuffle=True to load data in batches and shuffle each epoch.
Used a loop to iterate over DataLoader and print batches to verify correct batching and shuffling.
Results Interpretation

Before: Data loaded manually one by one, no batching or shuffling.

After: DataLoader loads data in shuffled batches of 4, improving efficiency and randomness.

PyTorch's DataLoader loads data efficiently in batches and reshuffles it every epoch, which speeds up training and helps the model generalize better.
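To see the per-epoch reshuffling in action, a minimal sketch (reusing the same dataset as the solution) iterates over the DataLoader twice; each pass draws a fresh random order, while every element still appears exactly once per epoch:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(torch.arange(20))
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Each full pass over the DataLoader is one "epoch" with a new random order
epoch1 = [batch.tolist() for (batch,) in loader]
epoch2 = [batch.tolist() for (batch,) in loader]

print("Epoch 1:", epoch1)
print("Epoch 2:", epoch2)
```

The two printed epochs almost always differ in order, but sorting all values from either epoch recovers 0..19 exactly, confirming shuffling changes only the order, never the contents.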
Bonus Experiment
Try using DataLoader with a custom dataset class that returns both features and labels.
💡 Hint
Create a class inheriting from torch.utils.data.Dataset and implement __len__ and __getitem__ methods.