torchvision's dataset classes and transforms, combined with PyTorch's DataLoader, turn image files into batches of tensors ready for model training. Together they handle reading the images, preprocessing them, and grouping them into batches.
Data loading with torchvision in Computer Vision
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define image transformations
transform = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.ToTensor()
])

# Load dataset
dataset = datasets.ImageFolder(root='path_to_images', transform=transform)

# Create data loader
loader = DataLoader(dataset, batch_size=32, shuffle=True)
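datasets.ImageFolder loads images from a root directory that contains one subfolder per class. Each subfolder name becomes a class label, and the sorted class names are mapped to integer indices (available as dataset.class_to_idx).
transforms.Compose chains transforms so they run in order, for example resizing an image and then converting it to a tensor.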
# Example 1: images in class subfolders, resized to 64x64
transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor()
])
dataset = datasets.ImageFolder('data/train', transform=transform)
loader = DataLoader(dataset, batch_size=16, shuffle=True)

# Example 2: built-in CIFAR-10 dataset, downloaded on first use
transform = transforms.ToTensor()
dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
loader = DataLoader(dataset, batch_size=64, shuffle=True)
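The complete program below loads the MNIST training set, applies a 28x28 resize (MNIST images are already 28x28, so this step only makes the expected size explicit), converts the images to tensors, and serves them in shuffled batches of 32. It prints the shape of one batch of images (torch.Size([32, 1, 28, 28])), the shape of the matching labels (torch.Size([32])), and the first 5 labels, which vary from run to run because the data is shuffled.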
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define transformations to resize and convert images to tensors
transform = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.ToTensor()
])

# Load MNIST dataset (handwritten digits)
dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

# Create data loader with batch size 32
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Get one batch of images and labels
images, labels = next(iter(loader))

print(f'Batch image tensor shape: {images.shape}')
print(f'Batch labels tensor shape: {labels.shape}')
print(f'First 5 labels in batch: {labels[:5].tolist()}')
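Include transforms.ToTensor() in the pipeline: it converts a PIL image into a float tensor in channels-first layout (C x H x W) with values scaled to [0.0, 1.0], which is the input format PyTorch models expect.
Setting shuffle=True reshuffles the data at every epoch, so the model does not see the samples in the same order each time. Use it for training; validation and test loaders are usually left unshuffled.
Batch size controls how many images the model processes per step; smaller batches use less memory but mean more steps per epoch.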
Use torchvision's datasets and DataLoader to easily load and prepare image data.
Apply transforms to resize and convert images before training.
Load data in batches for efficiency, and shuffle the training set so each epoch presents the samples in a new order.