
Train/val/test split in PyTorch - Model Pipeline Trace



Model Pipeline - Train/val/test split

This pipeline splits the dataset into three parts: training, validation, and test sets. The training set is used to fit the model's weights, the validation set guides hyperparameter tuning and model selection, and the test set is held out until the end to estimate performance on unseen data.

Data Flow - 2 Stages

1. Original Dataset

Input: 1000 rows x 10 columns → Start with the full dataset → Output: 1000 rows x 10 columns

[[5.1, 3.5, ..., 0], [4.9, 3.0, ..., 1], ...]

↓

2. Train/Val/Test Split

Input: 1000 rows x 10 columns → Split dataset into 70% train, 15% val, 15% test → Output: Train: 700 rows x 10 columns, Val: 150 rows x 10 columns, Test: 150 rows x 10 columns

Train sample: [5.1, 3.5, ..., 0], Val sample: [6.7, 3.1, ..., 1], Test sample: [5.9, 3.0, ..., 0]
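The 70/15/15 split described above could be sketched with `torch.utils.data.random_split`. This is a minimal illustration, not the tutorial's own code: the data is synthetic, and treating the 10 columns as 9 features plus 1 label column is an assumption based on the sample rows shown.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Synthetic stand-in for the 1000 x 10 dataset.
# Assumption: 9 feature columns plus 1 binary label column.
generator = torch.Generator().manual_seed(0)
features = torch.randn(1000, 9, generator=generator)
labels = torch.randint(0, 2, (1000,), generator=generator)
dataset = TensorDataset(features, labels)

# 70% / 15% / 15% split, given as explicit lengths so the sizes are exact.
# Passing a seeded generator makes the split reproducible.
train_set, val_set, test_set = random_split(
    dataset, [700, 150, 150], generator=generator
)

print(len(train_set), len(val_set), len(test_set))  # 700 150 150
```

Fixing the seed matters here: if the split changes between runs, rows that were in the test set once can end up in the training set later, which undermines the test estimate.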

Training Trace - Epoch by Epoch

[Chart: loss (y-axis, 0.0 to 1.0) against epochs 1 to 5 (x-axis)]

Epoch	Loss ↓	Accuracy ↑	Observation
1	0.85	0.60	Model starts learning with moderate loss and accuracy.
2	0.65	0.72	Loss decreases and accuracy improves as model learns.
3	0.50	0.80	Model shows good progress with lower loss and higher accuracy.
4	0.40	0.85	Training continues to improve with steady loss decrease.
5	0.35	0.88	Improvement slows; loss is low and accuracy is high, suggesting the model is approaching convergence.
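A per-epoch trace like the one in the table could be produced by a loop along these lines. It is a sketch under stated assumptions: the data is synthetic, the small network, optimizer, and learning rate are hypothetical choices, and the metrics are computed on the validation set (the table does not say which set its numbers come from). It will not reproduce the table's values.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic, linearly separable stand-in data (assumption: 9 features, binary label).
features = torch.randn(1000, 9)
labels = (features.sum(dim=1) > 0).long()
train_set, val_set, test_set = random_split(
    TensorDataset(features, labels), [700, 150, 150]
)

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=150)

# Hypothetical model and optimizer, chosen only for illustration.
model = nn.Sequential(nn.Linear(9, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)


def evaluate(loader):
    """Return (mean loss, accuracy) over a loader without updating weights."""
    model.eval()
    total_loss, correct, count = 0.0, 0, 0
    with torch.no_grad():
        for x, y in loader:
            logits = model(x)
            total_loss += loss_fn(logits, y).item() * len(y)
            correct += (logits.argmax(dim=1) == y).sum().item()
            count += len(y)
    return total_loss / count, correct / count


history = []
for epoch in range(1, 6):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    # Only the validation set is consulted during training; the test set stays untouched.
    val_loss, val_acc = evaluate(val_loader)
    history.append((epoch, val_loss, val_acc))
    print(f"epoch {epoch}: val_loss={val_loss:.3f} val_acc={val_acc:.3f}")
```

Once tuning is finished, calling `evaluate` a single time on a loader built from `test_set` would give the final, held-out estimate.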

Prediction Trace - 3 Layers

Layer 1: Input Sample

Layer 2: Model Forward Pass

Layer 3: Prediction
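The three steps above (input sample, forward pass, prediction) might look like this in code. The model here is a hypothetical, untrained two-class network used only to show the flow, so the predicted class carries no meaning.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical, untrained model (assumption: 9 input features, 2 classes).
model = nn.Sequential(nn.Linear(9, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()

# Layer 1: one input sample, shaped as a batch of size 1.
sample = torch.randn(1, 9)

# Layer 2: forward pass, with gradient tracking disabled for inference.
with torch.no_grad():
    logits = model(sample)

# Layer 3: convert logits to probabilities and pick the most likely class.
probs = torch.softmax(logits, dim=1)
predicted_class = probs.argmax(dim=1).item()

print(probs, predicted_class)
```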

Model Quiz - 3 Questions

Test your understanding

Why do we split data into train, validation, and test sets?

A. To train, tune, and fairly evaluate the model

B. To increase dataset size

C. To reduce model complexity

D. To speed up training only

Key Insight

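Splitting data into train, validation, and test sets keeps three jobs separate: fitting the model, tuning hyperparameters, and fairly measuring performance on unseen data. Because the test set plays no part in training or tuning, its score is not inflated by choices made to fit the other two sets.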