Data transforms in PyTorch - Model Pipeline Trace

Model Pipeline - Data transforms

This pipeline shows how raw images are transformed step by step to prepare them for a machine learning model. The images are loaded, resized, converted to tensors, and normalized into numbers the model can work with.

Data Flow - 4 Stages

1. Raw Data Input

1000 images x 3 channels x 256 height x 256 width → Load images from disk in RGB format → 1000 images x 3 channels x 256 height x 256 width

Image of a cat with shape (3, 256, 256)

↓

2. Resize

1000 images x 3 channels x 256 height x 256 width → Resize images to 128x128 pixels → 1000 images x 3 channels x 128 height x 128 width

Resized cat image with shape (3, 128, 128)

↓

3. ToTensor

1000 images x 3 channels x 128 height x 128 width (PIL images) → Convert images to PyTorch tensors in channels x height x width layout and scale pixel values from [0, 255] to [0, 1] → 1000 tensors x 3 channels x 128 height x 128 width (float32)

Tensor with values between 0 and 1 representing the cat image

↓

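4. Normalize

1000 tensors x 3 channels x 128 height x 128 width → Subtract the per-channel mean and divide by the per-channel standard deviation → 1000 normalized tensors x 3 channels x 128 height x 128 width

Tensor for the cat image with roughly zero mean and unit standard deviation per channel, provided the mean and std passed to Normalize match the dataset statistics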

Training Trace - Epoch by Epoch


[Chart: loss (y-axis) vs. epoch (x-axis), falling from 1.2 at epoch 1 to 0.4 at epoch 5]

Epoch	Loss ↓	Accuracy ↑	Observation
1	1.2	0.45	Model starts learning with high loss and low accuracy
2	0.9	0.60	Loss decreases and accuracy improves as model learns
3	0.7	0.72	Model continues to improve with more training
4	0.5	0.80	Loss lowers further and accuracy reaches 80%
5	0.4	0.85	Training converges with good accuracy and low loss

Prediction Trace - 4 Layers

Layer 1: Input Image

Layer 2: Resize

Layer 3: ToTensor

Layer 4: Normalize

Model Quiz - 3 Questions

Test your understanding

What does the 'Normalize' step do to the image data?

A. Changes pixel values to have zero mean and unit variance

B. Resizes the image to smaller dimensions

C. Converts the image to grayscale

D. Loads the image from disk

Key Insight

Data transforms convert raw images into a clean, consistent format, which helps the model learn better and faster. Normalization centers and rescales the values of each channel, which makes training more stable.