
Train/val/test split in PyTorch - Model Pipeline Trace

Model Pipeline - Train/val/test split

This pipeline splits the dataset into three parts: training, validation, and testing. The model learns from the training set, the validation set guides tuning choices such as hyperparameters, and the test set gives a final check on data the model has not seen.

Data Flow - 2 Stages
1. Original Dataset
   Input: 1000 rows x 10 columns
   Operation: Start with full dataset
   Output: 1000 rows x 10 columns
   Sample: [[5.1, 3.5, ..., 0], [4.9, 3.0, ..., 1], ...]
2. Train/Val/Test Split
   Input: 1000 rows x 10 columns
   Operation: Split dataset into 70% train, 15% val, 15% test
   Output: Train: 700 rows x 10 columns, Val: 150 rows x 10 columns, Test: 150 rows x 10 columns
   Sample: Train sample: [5.1, 3.5, ..., 0], Val sample: [6.7, 3.1, ..., 1], Test sample: [5.9, 3.0, ..., 0]
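
The 70/15/15 split described above can be sketched with `torch.utils.data.random_split`. This is a minimal illustration, not the page's own code: the data is synthetic, and treating the 10 columns as 9 features plus 1 binary label is an assumption based on the sample rows ending in 0 or 1.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Synthetic stand-in for the 1000 x 10 dataset.
# Assumption: 9 feature columns plus 1 binary label column.
features = torch.randn(1000, 9)
labels = torch.randint(0, 2, (1000,))
dataset = TensorDataset(features, labels)

# A seeded generator makes the split reproducible across runs.
generator = torch.Generator().manual_seed(42)

# 70% / 15% / 15% of 1000 rows -> 700 / 150 / 150 rows.
train_set, val_set, test_set = random_split(
    dataset, [700, 150, 150], generator=generator
)

print(len(train_set), len(val_set), len(test_set))  # 700 150 150
```

`random_split` shuffles the row indices before partitioning them, so the three subsets do not overlap. Fixing the generator seed keeps the same rows in the same subset on every run, which matters when comparing models against one validation set.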
Training Trace - Epoch by Epoch
[Plot: training loss (y-axis, 0.0 to 1.0) against epoch (x-axis, 1 to 5)]
Epoch  Loss ↓  Accuracy ↑  Observation
1      0.85    0.60        Model starts learning with moderate loss and accuracy.
2      0.65    0.72        Loss decreases and accuracy improves as model learns.
3      0.50    0.80        Model shows good progress with lower loss and higher accuracy.
4      0.40    0.85        Training continues to improve with steady loss decrease.
5      0.35    0.88        Model converges with low loss and high accuracy.
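
A per-epoch trace like the one above could be produced by a loop that trains on the training split and then measures loss and accuracy on the validation split. The sketch below assumes synthetic data, a hypothetical two-layer classifier, and an Adam optimizer, none of which come from the page. The page also does not say which split its numbers were measured on, so the values printed here will not match the table.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic stand-in data (assumption): 9 features and a binary label per row.
features = torch.randn(1000, 9)
labels = (features[:, 0] > 0).long()
dataset = TensorDataset(features, labels)
train_set, val_set, test_set = random_split(
    dataset, [700, 150, 150], generator=torch.Generator().manual_seed(0)
)

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=64)

# Hypothetical small classifier; the source does not specify an architecture.
model = nn.Sequential(nn.Linear(9, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)


def evaluate(loader):
    """Return (mean loss, accuracy) over a loader without updating weights."""
    model.eval()
    total_loss, correct, count = 0.0, 0, 0
    with torch.no_grad():
        for x, y in loader:
            logits = model(x)
            total_loss += loss_fn(logits, y).item() * len(y)
            correct += (logits.argmax(dim=1) == y).sum().item()
            count += len(y)
    return total_loss / count, correct / count


history = []
for epoch in range(1, 6):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    # Validation metrics are computed after each epoch, on data not used for updates.
    val_loss, val_acc = evaluate(val_loader)
    history.append((epoch, val_loss, val_acc))
    print(f"epoch {epoch}: val_loss={val_loss:.2f} val_acc={val_acc:.2f}")
```

The test split is deliberately left untouched inside the loop. Evaluating on it only once, after all tuning decisions are made, keeps the final score an honest estimate of performance on unseen data.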
Prediction Trace - 3 Layers
Layer 1: Input Sample
Layer 2: Model Forward Pass
Layer 3: Prediction
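
The three steps of the prediction trace can be sketched as follows. The model here is a hypothetical, untrained stand-in with 9 input features and 2 output classes, so the predicted class is arbitrary; only the flow from input to prediction is the point.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical untrained classifier standing in for the trained model.
model = nn.Sequential(nn.Linear(9, 16), nn.ReLU(), nn.Linear(16, 2))

# Layer 1: one input sample with 9 features, shaped as a batch of size 1.
sample = torch.randn(1, 9)

# Layer 2: forward pass in eval mode, with gradient tracking disabled.
model.eval()
with torch.no_grad():
    logits = model(sample)

# Layer 3: convert logits to class probabilities and pick the most likely class.
probs = torch.softmax(logits, dim=1)
predicted_class = probs.argmax(dim=1).item()

print(probs.shape, predicted_class)
```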
Model Quiz - 3 Questions
Test your understanding
Why do we split data into train, validation, and test sets?
A. To train, tune, and fairly evaluate the model
B. To increase dataset size
C. To reduce model complexity
D. To speed up training only
Key Insight
Splitting data into train, validation, and test sets gives each job its own data: the training set fits the model, the validation set guides hyperparameter tuning, and the test set fairly measures performance on unseen data.