
Image dataset from folders in PyTorch

Introduction

A folder-based image dataset lets you load and label pictures for training machine learning models: each image's label comes from the name of the folder it is stored in.

You have images sorted in folders by category and want to train a model to recognize them.
You want to quickly load images with labels without manually writing code to assign labels.
You need to apply transformations like resizing or normalization while loading images.
You want to prepare data for image classification tasks using PyTorch.
Syntax
PyTorch
from torchvision.datasets import ImageFolder
from torchvision import transforms

dataset = ImageFolder(root='path/to/data', transform=transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor()
]))

root is the main folder containing subfolders for each class.

transform applies changes to images when loading, like resizing or converting to tensors.

Examples
Loads images from the 'data/train' folder without any transformations. Each sample is returned as a PIL image together with its integer label.
PyTorch
from torchvision.datasets import ImageFolder
dataset = ImageFolder(root='data/train')
Loads images, resizes them to 128x128 pixels, then converts them to tensors.
PyTorch
from torchvision.datasets import ImageFolder
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor()
])
dataset = ImageFolder(root='data/train', transform=transform)
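Loads validation images and converts them directly to tensors without resizing. If the images differ in size, DataLoader's default batching cannot stack them, so add a resize step in that case.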
PyTorch
dataset = ImageFolder(root='data/val', transform=transforms.ToTensor())
Sample Program

This code loads images from the 'sample_data' folder, resizes them to 64x64 pixels, converts them to tensors, and loads them in batches of 4. It prints the class names, the shape of one batch of images, and the labels of that batch.

PyTorch
from torchvision.datasets import ImageFolder
from torchvision import transforms
from torch.utils.data import DataLoader

# Define transformations: resize and convert to tensor
transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor()
])

# Load dataset from folder 'sample_data'
dataset = ImageFolder(root='sample_data', transform=transform)

# Create a data loader to iterate in batches
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Print class names
print('Classes:', dataset.classes)

# Get one batch of images and labels
images, labels = next(iter(loader))

# Print batch shapes and labels
print('Batch image tensor shape:', images.shape)
print('Batch labels:', labels)
Important Notes

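Subfolder names inside the root folder become the class names automatically. ImageFolder sorts them alphabetically and maps each one to an integer label, available as dataset.classes and dataset.class_to_idx.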

Images should be organized as root/class_x/xxx.png, root/class_y/yyy.png, etc.
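The layout and label assignment described above can be sketched with the standard library alone. This is a simplified stand-in for illustration, not torchvision's implementation, and the class names and placeholder file name are hypothetical.

```python
import os
import tempfile

# Create a folder tree in the root/class_name/file layout.
# The files are empty placeholders, not real images.
root = tempfile.mkdtemp()
for class_name in ["dog", "cat", "bird"]:
    os.makedirs(os.path.join(root, class_name))
    open(os.path.join(root, class_name, "example.png"), "wb").close()

# Sketch of the label assignment: sorted subfolder names -> consecutive integers
classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
class_to_idx = {name: index for index, name in enumerate(classes)}

print(classes)
print(class_to_idx)
```

Since the mapping follows the sorted folder names, the order in which the folders were created does not affect the labels.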

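Transforms prepare images for model input, for example by resizing them to a fixed size and converting them to tensors. Random augmentation transforms, such as flips or crops, can also improve how well the trained model generalizes.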

Summary

ImageFolder loads images from folders where each folder is a class label.

Transforms can resize and convert images to tensors automatically.

DataLoader helps to batch and shuffle data for training.