The forward pass is the step where the model makes predictions by passing input data through its layers. The key metric here is loss, which measures how far the model's predictions are from the true labels: the lower the loss, the better the predictions. Accuracy can also be tracked to see what fraction of predictions are correct, but loss gives a finer-grained view of performance during the forward pass.
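As a minimal sketch, here is a forward pass through a toy PyTorch classifier with loss and accuracy computed from its outputs. The model shape, data, and sizes are all illustrative assumptions, not a prescribed architecture:

```python
import torch
import torch.nn as nn

# Illustrative toy setup: a tiny two-class classifier on random data.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(16, 4)           # batch of 16 samples, 4 features each
targets = torch.randint(0, 2, (16,))  # true class labels (0 or 1)

logits = model(inputs)                # forward pass: raw scores per class
loss = criterion(logits, targets)     # loss: how far predictions are from labels

preds = logits.argmax(dim=1)          # predicted class per sample
accuracy = (preds == targets).float().mean()
print(f"loss={loss.item():.4f}  accuracy={accuracy.item():.4f}")
```

With an untrained model the loss sits near random-guessing levels; as training lowers the loss, accuracy typically rises.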
For classification tasks, the confusion matrix shows how well the forward pass predictions match the true labels. Here is an example with 100 samples:
|                 | Predicted Positive      | Predicted Negative      |
|-----------------|-------------------------|-------------------------|
| Actual Positive | True Positive (TP): 40  | False Negative (FN): 10 |
| Actual Negative | False Positive (FP): 5  | True Negative (TN): 45  |
This matrix helps calculate precision, recall, and accuracy, which reflect the quality of the forward pass predictions.
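Using the counts from the confusion matrix above, the three metrics work out as follows:

```python
# Counts taken from the confusion matrix above (100 samples total)
TP, FP, FN, TN = 40, 5, 10, 45

precision = TP / (TP + FP)                   # of predicted positives, how many were right
recall    = TP / (TP + FN)                   # of actual positives, how many were caught
accuracy  = (TP + TN) / (TP + FP + FN + TN)  # fraction of all predictions that were correct

print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
# precision≈0.889, recall=0.800, accuracy=0.850
```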
During the forward pass, the model outputs predictions that can be tuned to favor precision or recall depending on the task:
- Precision is important when false alarms are costly. For example, in spam detection, you want to avoid marking good emails as spam.
- Recall is important when missing positive cases is costly. For example, in disease detection, you want to catch as many sick patients as possible.
The forward pass outputs probabilities that can be thresholded to balance precision and recall.
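The threshold trade-off can be sketched with a handful of hypothetical probabilities and labels (the numbers below are made up for illustration):

```python
# Hypothetical model probabilities and true labels, for illustration only
probs  = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1,    1,    0,    1,    0,    0]

def precision_recall(threshold):
    # Convert probabilities to hard predictions at the given threshold,
    # then compute precision and recall from the resulting counts.
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# A high threshold favors precision; a low one favors recall.
print(precision_recall(0.9))   # strict: (1.0, 0.333...) -- few false alarms, misses cases
print(precision_recall(0.25))  # lenient: (0.6, 1.0) -- catches everything, more false alarms
```

Sweeping the threshold from high to low traces out the precision-recall curve for the model.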
Good forward pass results have:
- Low loss values (close to 0)
- High accuracy (close to 1 or 100%)
- Balanced precision and recall depending on the task
Bad forward pass results show:
- High loss values (far from 0)
- Low accuracy (close to random guessing)
- Very low precision or recall, indicating poor prediction quality
Common pitfalls to watch for:
- Accuracy paradox: high accuracy can be misleading if classes are imbalanced.
- Data leakage: If test data leaks into training, forward pass metrics look unrealistically good.
- Overfitting indicators: Very low training loss but high validation loss means the model memorizes training data but fails on new data.
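The overfitting signature above can be checked programmatically. The loss curves and the gap tolerance below are illustrative assumptions, not values from any real run:

```python
# Hypothetical loss curves: training loss keeps falling while validation
# loss bottoms out and starts rising -- a classic overfitting signature.
train_loss = [1.20, 0.80, 0.50, 0.30, 0.15, 0.05]
val_loss   = [1.25, 0.90, 0.70, 0.65, 0.75, 0.90]

def looks_overfit(train, val, gap=0.2):
    # Flag when validation loss has stopped improving (risen past its minimum)
    # AND the train/validation gap has grown past a chosen tolerance.
    val_rising = val[-1] > min(val)
    gap_large = (val[-1] - train[-1]) > gap
    return val_rising and gap_large

print(looks_overfit(train_loss, val_loss))  # True for these curves
```

In practice this kind of check underlies early stopping: halt training once validation loss stops improving.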
Question: Your model has 98% accuracy but only 12% recall on fraud cases. Is it good for production?
Answer: No. Even though accuracy is high, the model misses most fraud cases (low recall). This means many frauds go undetected, which is risky. For fraud detection, recall is critical to catch as many frauds as possible.
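Hypothetical confusion-matrix counts consistent with that scenario make the gap concrete (the 10,000-transaction split below is invented for illustration):

```python
# Invented counts matching the scenario: 10,000 transactions, 200 frauds (2%)
TP, FN = 24, 176    # only 24 of the 200 frauds are caught
TN, FP = 9776, 24   # nearly all legitimate transactions pass

accuracy = (TP + TN) / (TP + TN + FP + FN)
recall = TP / (TP + FN)
print(f"accuracy={accuracy:.2%}  recall={recall:.2%}")  # 98.00% vs 12.00%
```

The 2% positive class lets a model look excellent on accuracy while letting 88% of frauds through, which is exactly the accuracy paradox described above.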