Data transforms in PyTorch - Model Pipeline Trace

Model Pipeline - Data transforms

This pipeline shows how raw image data is transformed step by step before it reaches a machine learning model. Images are loaded, resized, converted to tensors, and normalized into numeric values the model can use.

Data Flow - 4 Stages
1. Raw Data Input
   Input: 1000 images x 3 channels x 256 height x 256 width
   Operation: Load images from disk in RGB format
   Output: 1000 images x 3 channels x 256 height x 256 width
   Example: Image of a cat with shape (3, 256, 256)
2. Resize
   Input: 1000 images x 3 channels x 256 height x 256 width
   Operation: Resize images to 128x128 pixels
   Output: 1000 images x 3 channels x 128 height x 128 width
   Example: Resized cat image with shape (3, 128, 128)
3. ToTensor
   Input: 1000 images x 3 channels x 128 height x 128 width (PIL images)
   Operation: Convert images to PyTorch tensors and scale pixel values to [0, 1]
   Output: 1000 tensors x 3 channels x 128 height x 128 width (float32)
   Example: Tensor with values between 0 and 1 representing the cat image
4. Normalize
   Input: 1000 tensors x 3 channels x 128 height x 128 width
   Operation: Subtract the mean and divide by the std for each channel
   Output: 1000 normalized tensors x 3 channels x 128 height x 128 width
   Example: Tensor for the cat image whose channels are approximately zero-mean with unit std (exact only over the data used to compute the mean and std)
Training Trace - Epoch by Epoch

[Figure: loss (y-axis) versus epoch (x-axis) for epochs 1 to 5]
Epoch  Loss ↓  Accuracy ↑  Observation
1      1.2     0.45        Model starts learning with high loss and low accuracy
2      0.9     0.60        Loss decreases and accuracy improves as model learns
3      0.7     0.72        Model continues to improve with more training
4      0.5     0.80        Loss lowers further and accuracy reaches 80%
5      0.4     0.85        Training converges with good accuracy and low loss
Prediction Trace - 4 Layers
Layer 1: Input Image
Layer 2: Resize
Layer 3: ToTensor
Layer 4: Normalize
Model Quiz - 3 Questions
Test your understanding
What does the 'Normalize' step do to the image data?
A. Changes pixel values to have zero mean and unit variance
B. Resizes the image to smaller dimensions
C. Converts image to grayscale
D. Loads the image from disk
Key Insight
Data transforms convert raw images into a consistent size, data type, and value range, which helps the model train faster and more reliably. Normalization centers each channel near zero with a comparable scale, which makes training more stable.
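The centering effect of normalization can be checked by hand. The following is a small worked example in plain Python, using hypothetical pixel values for one channel. It computes the mean and std from those values, applies (x - mean) / std, and confirms that the result has mean close to 0 and std close to 1.

```python
from statistics import fmean, pstdev

# Hypothetical pixel values for one channel, already scaled to [0, 1].
red_channel = [0.10, 0.35, 0.60, 0.85]

mean = fmean(red_channel)
std = pstdev(red_channel)  # population standard deviation

# Same arithmetic as the Normalize step: subtract the mean, divide by the std.
normalized = [(x - mean) / std for x in red_channel]

print("mean before:", mean, "std before:", std)
print("mean after :", fmean(normalized), "std after :", pstdev(normalized))
```

The result is exactly zero-mean and unit-std here only because the statistics were computed from the same values being normalized. When fixed dataset-wide statistics are applied to a single image, the result is only approximately centered.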