What are image augmentation transforms in Computer Vision?

Introduction

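Image augmentation transforms create varied versions of a small set of images by applying random changes such as flips, rotations, and color shifts. Training on these variations helps a machine learning model generalize to pictures it has not seen before.

Use augmentation in situations like these: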

When you have a small number of images to train a model.

When you want your model to recognize objects from different angles or lighting.

When you want to reduce overfitting by showing the model many versions of the same image.

When you want to simulate real-world changes like rotation, flipping, or zooming.

When you want to improve model accuracy without collecting more data.

Syntax

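import torchvision

# Chain several transforms; they run top to bottom on every call.
transform = torchvision.transforms.Compose([
    torchvision.transforms.RandomHorizontalFlip(p=0.5),   # flip about half the time
    torchvision.transforms.RandomRotation(degrees=30),    # random angle in [-30, 30]
    torchvision.transforms.ColorJitter(brightness=0.2, contrast=0.2),
    torchvision.transforms.ToTensor()                     # PIL image -> float tensor
])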

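Use Compose to chain multiple transforms together. The transforms are applied in the order they are listed.

Each transform changes the image in a simple way, like flipping or rotating. Random transforms draw new parameters on every call, so the same input image can produce a different output each time.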
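To make the chaining idea concrete, here is a minimal pure-Python sketch of what "apply transforms in order" means. It is not torchvision's implementation: SimpleCompose, flip_horizontal, and invert are hypothetical names made up for this illustration, and the "image" is just a nested list of pixel values.

```python
# A toy illustration of chaining transforms; not torchvision's actual code.

class SimpleCompose:
    """Hypothetical stand-in for Compose: applies callables in order."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image):
        for t in self.transforms:
            image = t(image)  # the output of one step feeds the next
        return image


def flip_horizontal(image):
    """Reverse each row of a 2D list of pixel values."""
    return [row[::-1] for row in image]


def invert(image):
    """Map each 8-bit pixel value v to 255 - v."""
    return [[255 - v for v in row] for row in image]


pipeline = SimpleCompose([flip_horizontal, invert])
tiny_image = [[0, 10, 20],
              [30, 40, 50]]
result = pipeline(tiny_image)
print(result)  # [[235, 245, 255], [205, 215, 225]]
```

The real Compose works the same way in spirit: it holds a list of callables and passes the image through each one in turn.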

Examples

Always flips the image horizontally, because p=1.0 sets the flip probability to 100%.

transform = torchvision.transforms.RandomHorizontalFlip(p=1.0)

Rotates the image by a random angle between -45 and +45 degrees.

transform = torchvision.transforms.RandomRotation(degrees=45)

Randomly scales the brightness of the image by a factor between 0.5 and 1.5.

transform = torchvision.transforms.ColorJitter(brightness=0.5)

Crops a random region of the image, resizes it to 224x224 pixels, then converts it to a tensor.

transform = torchvision.transforms.Compose([
    torchvision.transforms.RandomResizedCrop(224),
    torchvision.transforms.ToTensor()
])

Sample Program

This code loads an image from the internet, applies several augmentation transforms, and prints the shape and type of the resulting tensor. The image is flipped, rotated, adjusted in brightness and contrast, and then converted to a tensor.

import torch
from torchvision import transforms
from PIL import Image
import requests
from io import BytesIO

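# Load an example image from the web
url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/PNG_transparency_demonstration_1.png/640px-PNG_transparency_demonstration_1.png'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early if the download failed
# This PNG includes transparency; convert to RGB so the tensor has 3 channels
img = Image.open(BytesIO(response.content)).convert('RGB')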

# Define augmentation transforms
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=1.0),
    transforms.RandomRotation(degrees=30),
    transforms.ColorJitter(brightness=0.3, contrast=0.3),
    transforms.ToTensor()
])

# Apply transforms
augmented_img = transform(img)

# Show shape and type
print(f"Augmented image tensor shape: {augmented_img.shape}")
print(f"Tensor type: {augmented_img.dtype}")

Important Notes

Transforms like RandomHorizontalFlip and RandomRotation add variety to training images.

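PyTorch models expect tensors, so convert images with ToTensor before feeding them to a model.

Apply random augmentation only to training data. Validation and test data should receive only deterministic preprocessing, such as resizing, so that evaluation results stay comparable between runs.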

Summary

Image augmentation creates new images by changing originals slightly.

Transforms include flipping, rotating, cropping, and color changes.

Using augmentation helps models generalize to new images and reduces overfitting.

Practice

1. What is the main purpose of image augmentation in training machine learning models? (easy)

A. To reduce the size of the training dataset

B. To remove noise from images

C. To create more varied training images by modifying originals

D. To convert images to grayscale only

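Solution

Step 1: Understand image augmentation. It produces modified copies of existing images, for example flipped, rotated, or color-shifted versions.

Step 2: Purpose in training. These varied copies show the model more variation without collecting new data, which helps reduce overfitting.

Final Answer: C. To create more varied training images by modifying originals

Quick Check: Augmentation adds variety to the training images rather than shrinking the dataset, removing noise, or converting to grayscale, so option C is correct.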
