GPU tensors (to, cuda) in PyTorch - Model Metrics & Evaluation
When using GPU tensors in PyTorch, the main goal is to speed up model training and inference, so the key metric is wall-clock training or inference time. Accuracy and loss should stay essentially the same on CPU and GPU (up to small floating-point differences); only speed changes. The most important measurement is therefore how much faster your model runs on the GPU compared to the CPU.
GPU tensors do not affect prediction correctness, so a confusion matrix is not relevant here. Instead, run a simple timing comparison:
CPU time: 10 seconds
GPU time: 2 seconds
Speedup: 5x faster
This shows the benefit of moving tensors to the GPU using to('cuda') or cuda().
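A timing comparison like the one above can be sketched as follows. This is a minimal benchmark, not a definitive methodology: the matrix size and repeat count are illustrative, and the torch.cuda.synchronize() calls are needed because CUDA kernels launch asynchronously.

```python
import time
import torch

def time_matmul(device, size=1024, repeats=10):
    # Illustrative workload; adjust size to your hardware.
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)
    torch.matmul(a, b)  # warm-up, excludes one-time setup cost
    if device == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work to finish
    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats

cpu_t = time_matmul("cpu")
print(f"CPU: {cpu_t:.4f} s per matmul")
if torch.cuda.is_available():
    gpu_t = time_matmul("cuda")
    print(f"GPU: {gpu_t:.4f} s per matmul, speedup: {cpu_t / gpu_t:.1f}x")
```

Without synchronization the GPU timing would only measure kernel launch time, which makes the GPU look unrealistically fast.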
For GPU tensors, the tradeoff is between speed and resource use: the GPU speeds up training but consumes more power and dedicated GPU memory, while the CPU is slower but lighter on both.
Example: training that takes 10 minutes on CPU may take 2 minutes on GPU, at the cost of extra power draw and GPU memory. Choose the GPU when speed is critical and the CPU when resources are limited.
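To make the memory side of this tradeoff concrete, PyTorch exposes how much GPU memory the current tensors occupy. A minimal sketch, which falls back to CPU when CUDA is unavailable:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)  # ~4 MB of float32 data

if device == "cuda":
    # Bytes currently allocated for tensors by the caching allocator.
    print(f"GPU memory allocated: {torch.cuda.memory_allocated() / 1e6:.1f} MB")
    del x
    torch.cuda.empty_cache()  # return cached blocks to the driver
else:
    print("CUDA not available; tensor stays in ordinary host RAM")
```

Watching torch.cuda.memory_allocated() during training is a quick way to see how close you are to the out-of-memory limit mentioned below.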
Good: GPU training time is significantly less than CPU time (e.g., 5x faster), while model accuracy and loss remain consistent.
Bad: no speed improvement, or slower training on GPU due to transfer overhead or incorrect tensor placement; runtime errors from mixing CPU and GPU tensors.
- Not moving the model and all tensors to the same device causes runtime errors.
- Attributing accuracy or loss differences to the device change is misleading; the device does not affect correctness beyond small floating-point differences.
- Ignoring GPU memory limits can cause out-of-memory errors that stop training.
- The overhead of moving data to the GPU can make small models slower on GPU than on CPU.
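The first pitfall above, mixed device placement, can be avoided with a standard pattern: pick the device once, then move both the model and every input batch to it. A minimal sketch:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)  # moves the model's parameters to the device
x = torch.randn(8, 4)               # new tensors are created on CPU by default

# Calling model(x) with x on CPU and model on GPU raises a RuntimeError,
# so move the input to the same device first.
y = model(x.to(device))
print(y.shape)  # torch.Size([8, 2])
```

Note that .to(device) on a tensor returns a new tensor rather than modifying it in place, so the result must be assigned or passed on, while .to(device) on a module moves its parameters in place.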
Your model runs at 98% accuracy on CPU in 10 minutes. On GPU, accuracy is still 98% but training takes 12 minutes. Is this good?
Answer: No. Training is slower on the GPU, which means the GPU is not being used efficiently. Check whether the model and tensors were actually moved to the GPU, whether the model is so small that transfer overhead dominates, and whether GPU memory is sufficient.
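The first check in the answer above can be done in one line: inspect where the model's parameters actually live. A minimal sketch with a hypothetical small model:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for your model

# Collect the device types of all parameters; a model that was never
# moved reports only 'cpu' here.
devices = {p.device.type for p in model.parameters()}
print(devices)  # {'cpu'}
```

If this prints {'cpu'} after you believed the model was on the GPU, the .to('cuda') call was missing or its result was discarded.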