Computer Vision · ~12 mins

Cutout and CutMix in Computer Vision - Model Pipeline Trace

Model Pipeline - Cutout and CutMix

This pipeline shows how Cutout and CutMix improve image classification by augmenting the training images, forcing the model to learn more robust features. Cutout masks out a random square patch of an image; CutMix pastes a patch from one image into another and mixes the two labels in proportion to the pasted area.

Data Flow - 4 Stages
Stage 1: Original Dataset
In: 5000 images x 32 x 32 x 3 -> Raw images with labels -> Out: 5000 images x 32 x 32 x 3
Example: image of a cat with label 'cat'
Stage 2: Cutout Augmentation
In: 5000 images x 32 x 32 x 3 -> Randomly mask a square patch in each image -> Out: 5000 images x 32 x 32 x 3
Example: cat image with an 8x8 black square hiding part of the cat
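The Cutout stage above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation; the function name `cutout` and the default patch size of 8 (matching the 8x8 example) are our choices. As in the original technique, the patch center is sampled anywhere in the image, so the patch may be clipped at the border.

```python
import numpy as np

def cutout(image, size=8, rng=None):
    """Zero out a random size x size square patch of an H x W x C image."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    # Sample the patch center anywhere in the image.
    cy = int(rng.integers(0, h))
    cx = int(rng.integers(0, w))
    # Clip the patch so it stays inside the image bounds.
    y1, y2 = max(0, cy - size // 2), min(h, cy + size // 2)
    x1, x2 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = image.copy()
    out[y1:y2, x1:x2] = 0  # mask the patch with zeros (black)
    return out
```

The label is left unchanged: the image still shows a cat, just with part of it hidden, so the model must rely on the visible context.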
Stage 3: CutMix Augmentation
In: 5000 images x 32 x 32 x 3 -> Mix patches from two images and combine labels proportionally -> Out: 5000 images x 32 x 32 x 3
Example: image with half cat and half dog, labeled 0.6 cat + 0.4 dog
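The CutMix stage can be sketched similarly. The sketch below follows the standard recipe: draw a mixing ratio from a Beta distribution, paste a box from the second image whose area is proportional to that ratio, and mix the one-hot labels by the exact pasted area. The function name `cutmix` and the `alpha=1.0` default are our assumptions.

```python
import numpy as np

def cutmix(img1, lab1, img2, lab2, alpha=1.0, rng=None):
    """Paste a random box from img2 into img1; mix labels by pasted area."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)            # fraction of img1 to keep
    h, w = img1.shape[:2]
    # Box side lengths scale with sqrt(1 - lam) so area matches (1 - lam).
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
    y1, y2 = max(0, cy - cut_h // 2), min(h, cy + cut_h // 2)
    x1, x2 = max(0, cx - cut_w // 2), min(w, cx + cut_w // 2)
    mixed = img1.copy()
    mixed[y1:y2, x1:x2] = img2[y1:y2, x1:x2]
    # Recompute lam from the box that was actually pasted after clipping.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    mixed_label = lam * lab1 + (1.0 - lam) * lab2
    return mixed, mixed_label
```

With one-hot labels `[1, 0]` for cat and `[0, 1]` for dog, a paste covering 40% of the image yields exactly the "0.6 cat + 0.4 dog" label from the example.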
Stage 4: Model Training
In: batch of 128 images x 32 x 32 x 3 -> Train CNN on augmented images and labels -> Out: updated model weights
The CNN learns features from images with Cutout and CutMix applied.
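Training on CutMix outputs requires a loss that accepts soft labels such as "0.6 cat + 0.4 dog". A common choice is cross-entropy against the mixed label vector, sketched below; the helper name `soft_cross_entropy` is our own, and the probabilities are illustrative, not model outputs.

```python
import numpy as np

def soft_cross_entropy(probs, soft_label, eps=1e-9):
    """Cross-entropy of predicted probabilities against a soft label."""
    return float(-np.sum(soft_label * np.log(probs + eps)))

# Mixed label from the CutMix example: 0.6 cat + 0.4 dog.
soft_label = np.array([0.6, 0.4])
confident  = np.array([0.9, 0.1])  # model is overconfident it sees only a cat
hedged     = np.array([0.6, 0.4])  # model's prediction matches the mix
# The loss is minimized when the prediction matches the mixed label,
# so the model is rewarded for acknowledging both classes.
loss_hedged = soft_cross_entropy(hedged, soft_label)
loss_confident = soft_cross_entropy(confident, soft_label)
```

This is why CutMix discourages overconfidence: predicting pure "cat" on a half-cat, half-dog image costs more than predicting the actual mixture.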
Training Trace - Epoch by Epoch
Loss
1.2 |*       
1.0 | *      
0.8 |  *     
0.6 |   *    
0.4 |    *   
0.2 |     *  
0.0 +--------
     1 5 10 15 20 Epochs
Epoch | Loss ↓ | Accuracy ↑ | Observation
    1 | 1.20   | 0.45       | Model starts learning; loss high, accuracy low
    5 | 0.80   | 0.65       | Loss decreases; accuracy improves with augmentation
   10 | 0.50   | 0.80       | Model learns robust features; accuracy rises
   15 | 0.35   | 0.88       | Loss low, accuracy high; good generalization
   20 | 0.30   | 0.90       | Training converged with strong performance
Prediction Trace - 6 Layers
Layer 1: Input Image
Layer 2: Convolutional Layers
Layer 3: Pooling Layers
Layer 4: Fully Connected Layers
Layer 5: Softmax Activation
Layer 6: Final Prediction
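Layer 5 in the trace above, the softmax activation, turns the final-layer scores (logits) into class probabilities. A minimal sketch, with made-up logits for three classes:

```python
import numpy as np

def softmax(logits):
    """Convert logits to probabilities that sum to 1 (Layer 5)."""
    z = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # hypothetical scores for 3 classes
probs = softmax(logits)
# Layer 6, the final prediction, is the class with the highest probability.
prediction = int(probs.argmax())
```

Subtracting the maximum logit before exponentiating does not change the result but prevents overflow for large scores.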
Model Quiz - 3 Questions
Test your understanding
What does the Cutout augmentation do to the training images?
A. Changes image colors randomly
B. Mixes two images together
C. Randomly hides a square patch in the image
D. Rotates the image by 90 degrees
Key Insight
Cutout and CutMix regularize the model by making training images more varied and challenging: the model cannot rely on any single region of an image, so it learns more robust features. The result is lower loss and higher accuracy, indicating the model generalizes well to new images.