
Albumentations integration in PyTorch

Introduction
Albumentations is an image augmentation library. It applies controlled, often random, changes to images so that a model sees more variety during training and generalizes better. Use it:
When you want to enlarge an image dataset with slightly altered copies of each picture.
When you want your model to learn from different views of the same image.
When you want to bring images to a consistent size by resizing or cropping before training.
When you want to add random changes such as flips or brightness shifts to help your model generalize.
When you want to prepare images for PyTorch models with little extra code.
Syntax
PyTorch
import albumentations as A
from albumentations.pytorch import ToTensorV2

transform = A.Compose([
    A.Resize(128, 128),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
    ToTensorV2()
])

transformed = transform(image=image)
image_tensor = transformed['image']
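A.Compose chains several transforms and applies them in the order listed.
ToTensorV2 converts a NumPy image of shape (height, width, channels) into a PyTorch tensor of shape (channels, height, width). It does not rescale pixel values, so a uint8 image stays uint8.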
Examples
With probability 0.9, this rotates the image by a random angle between -40 and 40 degrees. It then flips the image horizontally half the time.
PyTorch
transform = A.Compose([
    A.Rotate(limit=40, p=0.9),
    A.HorizontalFlip(p=0.5),
    ToTensorV2()
])
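This normalizes each channel with the ImageNet mean and standard deviation that many pretrained models expect, then converts the result to a tensor. By default, Normalize first scales pixel values by 1/255, so the output is floating point.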
PyTorch
transform = A.Compose([
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ToTensorV2()
])
This crops a random 100x100 patch from the image before converting it to a tensor. The input image must be at least 100 pixels in each dimension.
PyTorch
transform = A.Compose([
    A.RandomCrop(width=100, height=100),
    ToTensorV2()
])
Sample Program
This code creates a random image, applies resizing, flipping, and brightness and contrast changes, then converts it to a PyTorch tensor. It prints the tensor's shape, dtype, and minimum and maximum values.
PyTorch
import albumentations as A
from albumentations.pytorch import ToTensorV2
import torch
import numpy as np

# Create a dummy image (3 channels, 256x256) with random colors
image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

# Define transformations
transform = A.Compose([
    A.Resize(128, 128),
    A.HorizontalFlip(p=1.0),  # Always flip for demo
    A.RandomBrightnessContrast(p=1.0),  # Always apply for demo
    ToTensorV2()
])

# Apply transformations
transformed = transform(image=image)
image_tensor = transformed['image']

# Check tensor shape and type
print(f"Tensor shape: {image_tensor.shape}")
print(f"Tensor dtype: {image_tensor.dtype}")
print(f"Tensor min value: {image_tensor.min().item():.4f}")
print(f"Tensor max value: {image_tensor.max().item():.4f}")
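Output
The shape is torch.Size([3, 128, 128]) and the dtype is torch.uint8. The minimum and maximum values vary between runs because both the image and the brightness and contrast adjustment are random.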
Important Notes
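Albumentations operates on images as NumPy arrays in (height, width, channels) layout, not on PyTorch tensors.
Place ToTensorV2 last in the pipeline so the model receives a tensor. ToTensorV2 does not rescale pixel values, so add A.Normalize before it, or convert the tensor to float, when the model expects floating-point input.
Set the probability p on each transform to control how often it is applied during training.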
Summary
Albumentations applies image augmentations that help models generalize.
Use Compose to chain transforms such as resizing, flipping, and brightness changes.
Convert images to PyTorch tensors with ToTensorV2 as the last step of the pipeline.