
Dataset class (custom datasets) in PyTorch - Model Metrics & Evaluation

Which metric matters for the Dataset class (custom datasets), and why

When working with custom datasets in PyTorch, the key "metric" is data loading correctness and efficiency: your dataset class must return the right samples, paired with the right labels, without errors. This is not a model metric like accuracy, but it is critical, because bad data loading silently corrupts training and produces poor model results. Efficiency matters too, so that data loading does not become the bottleneck that slows training down.
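
The idea above can be sketched as a minimal map-style dataset. The class below follows PyTorch's Dataset protocol (`__len__` plus `__getitem__`); in real code you would subclass `torch.utils.data.Dataset`, but the protocol itself is all a DataLoader needs, so this sketch stays torch-free. The toy samples and labels are hypothetical.

```python
class ListDataset:
    """Minimal map-style dataset: (sample, label) pairs held in memory.

    Sketch only: in real PyTorch code this would subclass
    torch.utils.data.Dataset, but the protocol just needs
    __len__ and __getitem__.
    """

    def __init__(self, samples, labels):
        # Correctness starts here: every sample must have a label.
        assert len(samples) == len(labels), "every sample needs a label"
        self.samples = samples
        self.labels = labels

    def __len__(self):
        # The DataLoader uses this to know which indices are valid.
        return len(self.samples)

    def __getitem__(self, idx):
        # Must return the sample and its *matching* label.
        return self.samples[idx], self.labels[idx]


ds = ListDataset(samples=[[0.1, 0.2], [0.3, 0.4]], labels=[0, 1])
print(len(ds))   # 2
print(ds[1])     # ([0.3, 0.4], 1)
```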

Confusion matrix or equivalent visualization

For dataset classes, we don't have a confusion matrix. Instead, we check data integrity by verifying the number of samples matches expectations and that each sample-label pair is correct. For example:

    Dataset size: 1000 samples
    Sample 0: image shape (3, 224, 224), label: 5
    Sample 999: image shape (3, 224, 224), label: 2
    

This ensures the dataset class correctly loads all data.
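
A check like the printout above is easy to automate before training starts. The helper below is a sketch; the expected sample length and number of classes are assumptions you would replace with your own dataset's values, and `ToyDataset` exists only to demonstrate the check.

```python
def check_dataset(ds, expected_len, sample_len, num_classes):
    """Spot-check a map-style dataset before training starts."""
    assert len(ds) == expected_len, (
        f"expected {expected_len} samples, got {len(ds)}")
    for idx in (0, len(ds) - 1):          # spot-check first and last index
        sample, label = ds[idx]
        assert len(sample) == sample_len, f"bad sample shape at index {idx}"
        assert 0 <= label < num_classes, f"bad label {label} at index {idx}"
    return True


class ToyDataset:
    # Hypothetical dataset used only to demonstrate the check.
    def __init__(self, n):
        self.data = [([float(i)] * 4, i % 3) for i in range(n)]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


print(check_dataset(ToyDataset(1000), expected_len=1000,
                    sample_len=4, num_classes=3))  # True
```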

Tradeoff: Correctness vs Efficiency in Dataset class

There is a tradeoff between loading data correctly and loading it fast. If you load all data into memory at once, loading is fast but uses lots of RAM. If you load data on the fly, it uses less memory but can slow training. The goal is to balance correctness (no errors, right labels) with efficiency (fast enough to keep training smooth).
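
The two strategies can be contrasted in a short sketch. `load_sample` is a hypothetical stand-in for whatever expensive disk read or decode your data actually needs; both classes follow the same `__len__`/`__getitem__` protocol.

```python
def load_sample(path):
    # Hypothetical stand-in for an expensive disk read / image decode.
    return [0.0, 0.0, 0.0]


class EagerDataset:
    """Loads everything up front: fast __getitem__, high RAM use."""

    def __init__(self, paths, labels):
        self.items = [(load_sample(p), y) for p, y in zip(paths, labels)]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]          # already in memory


class LazyDataset:
    """Loads on demand: low RAM use, pays the I/O cost per access."""

    def __init__(self, paths, labels):
        self.paths, self.labels = paths, labels

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        return load_sample(self.paths[idx]), self.labels[idx]
```

With PyTorch's DataLoader, a lazy dataset is usually paired with `num_workers > 0`, so the per-access I/O cost overlaps with GPU compute instead of blocking it.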

What "good" vs "bad" looks like for Dataset class

Good: Dataset returns correct samples and labels, length matches dataset size, no crashes during training, data shapes are consistent, and loading speed keeps up with training.

Bad: Dataset returns wrong labels, crashes on some indices, reports the wrong length, returns inconsistent data shapes, or loads too slowly, stalling training.

Common pitfalls in Dataset class metrics
  • Mixing up labels and data order causing wrong training signals.
  • Not implementing __len__ or __getitem__ correctly.
  • Loading all data into memory causing crashes on large datasets.
  • Slow data loading blocking GPU training.
  • Data leakage by accidentally including test data in training dataset.
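
The first pitfall, samples and labels drifting out of sync, often comes from shuffling or sorting two parallel lists independently. A minimal illustration with hypothetical filenames:

```python
import random

samples = ["cat.jpg", "dog.jpg", "bird.jpg"]
labels = [0, 1, 2]

# BUG: shuffling each list independently destroys the pairing,
# so samples[i] no longer matches labels[i].
random.Random(1).shuffle(samples)
random.Random(2).shuffle(labels)

# FIX: shuffle index positions once, then index both lists with them.
samples = ["cat.jpg", "dog.jpg", "bird.jpg"]
labels = [0, 1, 2]
order = list(range(len(samples)))
random.Random(42).shuffle(order)
samples = [samples[i] for i in order]
labels = [labels[i] for i in order]
# samples[i] and labels[i] still refer to the same example.
```
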
Self-check question

Your custom dataset class returns 1000 samples but during training, the model gets random results and loss does not improve. What could be wrong?

Answer: The dataset might be returning wrong labels or mismatched data-label pairs. Check your __getitem__ method to ensure correct data loading.
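
One way to run that check is to compare a handful of `__getitem__` results against pairs you have verified by hand. The helper and the `PairDataset` below are hypothetical sketches of that idea.

```python
def spot_check(ds, expected_labels):
    """Compare __getitem__ output at known indices against hand-checked labels."""
    for idx, want_label in expected_labels.items():
        _, label = ds[idx]
        assert label == want_label, (
            f"index {idx}: dataset returned label {label}, expected {want_label}")
    return True


class PairDataset:
    # Hypothetical dataset used only to demonstrate the check.
    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        return self.pairs[idx]


ds = PairDataset([("img0", 5), ("img1", 3), ("img2", 7)])
print(spot_check(ds, {0: 5, 2: 7}))  # True
```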

Key Result
For custom datasets, correctness of data loading and label matching is the key metric to ensure valid training.