PyTorch · ~12 mins

Data augmentation with transforms in PyTorch - Model Pipeline Trace


This pipeline shows how data augmentation uses image transforms to create varied training images. This helps the model learn better by seeing different versions of the same image.

Data Flow - 3 Stages
Stage 1: Original Dataset
Operation: load raw images from the dataset
Input:  1000 images x 3 channels x 32 height x 32 width
Output: 1000 images x 3 channels x 32 height x 32 width
Example: a cat image, 32x32 pixels, 3 color channels

Stage 2: Apply Data Augmentation Transforms
Operation: random horizontal flip, random rotation, random crop
Input:  1000 images x 3 channels x 32 height x 32 width
Output: 1000 images x 3 channels x 32 height x 32 width (shape unchanged; pixel content varies)
Example: the original cat image flipped horizontally and slightly rotated

Stage 3: Batch Preparation
Operation: group images into batches of 100
Input:  1000 images x 3 channels x 32 height x 32 width
Output: 10 batches x 100 images x 3 channels x 32 height x 32 width
Example: batch 1 contains 100 augmented cat and dog images
Training Trace - Epoch by Epoch

Loss
1.2 |*       
1.0 | *      
0.8 |  *     
0.6 |   *    
0.4 |    *   
    +---------
     1 2 3 4 5
     Epochs
Epoch | Loss ↓ | Accuracy ↑ | Observation
1     | 1.20   | 0.45       | Model starts learning with high loss and low accuracy
2     | 0.90   | 0.60       | Loss decreases and accuracy improves as the model sees augmented data
3     | 0.70   | 0.72       | Model learns better features from the varied augmented images
4     | 0.55   | 0.80       | Loss continues to drop; accuracy rises steadily
5     | 0.45   | 0.85       | Model converges with good accuracy on augmented data
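An epoch loop like the one traced above can be sketched as follows. The model, optimizer, and learning rate are illustrative assumptions (the trace does not specify them), and a single random batch stands in for the augmented dataset, so the printed loss values will not match the table.

```python
import torch
import torch.nn as nn

# A deliberately tiny stand-in model (assumption, not the traced model)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

# One augmented batch standing in for the training data
images = torch.rand(100, 3, 32, 32)
labels = torch.randint(0, 10, (100,))

for epoch in range(5):
    optimizer.zero_grad()
    logits = model(images)
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()
    accuracy = (logits.argmax(dim=1) == labels).float().mean().item()
    print(f"epoch {epoch + 1}: loss={loss.item():.2f} acc={accuracy:.2f}")
```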
Prediction Trace - 5 Layers
Layer 1: Input Image
Layer 2: Random Horizontal Flip
Layer 3: Random Rotation
Layer 4: Random Crop
Layer 5: Normalized Tensor
Model Quiz - 3 Questions
Test your understanding
Why do we apply random horizontal flips during data augmentation?
A. To convert images to grayscale
B. To reduce the image size
C. To help the model learn to recognize objects flipped left to right
D. To increase the number of color channels
Key Insight
Data augmentation creates varied versions of images, helping the model learn more robust features. This leads to better accuracy and lower loss during training because the model sees many different views of the same objects.