Batch size and epochs in TensorFlow - Model Metrics & Evaluation

Batch size and epochs both affect how well a model learns. The key metrics to watch are training loss and validation loss: they show whether the model is genuinely improving or merely memorizing the training data. Accuracy on the validation set is a further check that the model generalizes. Effective training shows low loss and high accuracy on validation data.
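In Keras, both knobs are arguments to `model.fit(x, y, batch_size=32, epochs=20, validation_split=0.2)`. Together with the dataset size they determine how many gradient updates the model receives; a minimal sketch of that arithmetic (the dataset size and settings below are hypothetical):

```python
import math

def training_updates(n_samples: int, batch_size: int, epochs: int) -> int:
    """Total number of gradient updates performed over a training run."""
    # The last batch of each epoch may be partial, hence the ceiling.
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs

# Hypothetical dataset of 1,000 samples:
print(training_updates(1000, batch_size=32, epochs=5))
print(training_updates(1000, batch_size=32, epochs=20))
print(training_updates(1000, batch_size=128, epochs=20))
```

Note that batch size 128 for 20 epochs performs the same number of updates as batch size 32 for 5 epochs, which is one reason large-batch runs can underperform unless epochs or the learning rate are increased.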
For classification tasks, a confusion matrix shows how batch size and epoch choices play out in the predictions. Here is an example after training:
| | Predicted Positive | Predicted Negative |
|---|--------------------|--------------------|
| Actual Positive | True Positive (TP): 80 | False Negative (FN): 20 |
| Actual Negative | False Positive (FP): 10 | True Negative (TN): 90 |
Total samples = 80 + 20 + 10 + 90 = 200
Precision = TP / (TP + FP) = 80 / (80 + 10) ≈ 0.89
Recall = TP / (TP + FN) = 80 / (80 + 20) = 0.80
F1 Score = 2 * (0.89 * 0.80) / (0.89 + 0.80) ≈ 0.84
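The same numbers can be checked in a few lines of Python, using the counts from the table above:

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts from the confusion matrix above:
precision, recall, f1 = classification_metrics(tp=80, fn=20, fp=10, tn=90)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```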
Choosing batch size and epochs changes both the speed and the quality of learning. A small batch size makes gradient updates noisier, which slows training but can act as a regularizer and lead to solutions that generalize better, improving recall (more true positives found). A large batch size processes each epoch faster but tends to generalize worse, which can lower recall.
More epochs give the model more time to learn. Too few epochs cause underfitting, with low precision and recall everywhere. Too many cause overfitting: training metrics keep improving while validation precision and recall degrade.
Example (illustrative numbers):
- Batch size 32, epochs 5: Precision 0.85, Recall 0.75 (underfitting)
- Batch size 32, epochs 20: Precision 0.89, Recall 0.80 (balanced)
- Batch size 128, epochs 20: Precision 0.92, Recall 0.70 (overfitting, missing positives)
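One way to compare such runs is the F1 score, which balances precision and recall. A short sketch over the three hypothetical results above:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) for each hypothetical training run:
runs = {
    "batch 32, epochs 5": (0.85, 0.75),
    "batch 32, epochs 20": (0.89, 0.80),
    "batch 128, epochs 20": (0.92, 0.70),
}

for name, (p, r) in runs.items():
    print(f"{name}: F1 = {f1_score(p, r):.3f}")

# The balanced run wins despite not having the highest precision.
best = max(runs, key=lambda name: f1_score(*runs[name]))
print("best:", best)
```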
Good signs:
- Validation loss decreases and stabilizes
- Validation accuracy improves and stays high
- Precision and recall are balanced (both above 0.8)
- No big gap between training and validation metrics (no overfitting)
Bad signs:
- Validation loss increases or fluctuates wildly
- Validation accuracy is low or drops after some epochs
- Precision very high but recall very low (or vice versa)
- Training metrics much better than validation (overfitting)
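Several of these signs can be checked programmatically from the `history.history` dict that Keras's `model.fit` returns. A minimal heuristic sketch (the patience threshold and the loss histories below are made up for illustration):

```python
def looks_overfit(train_loss, val_loss, patience=3):
    """Flag overfitting when validation loss has risen for `patience`
    consecutive epochs while training loss kept falling."""
    if len(val_loss) <= patience:
        return False
    recent_val = val_loss[-(patience + 1):]
    recent_train = train_loss[-(patience + 1):]
    val_rising = all(b > a for a, b in zip(recent_val, recent_val[1:]))
    train_falling = all(b < a for a, b in zip(recent_train, recent_train[1:]))
    return val_rising and train_falling

# Hypothetical loss curves:
healthy = {"loss": [0.9, 0.6, 0.4, 0.3, 0.25], "val_loss": [1.0, 0.7, 0.5, 0.4, 0.38]}
overfit = {"loss": [0.9, 0.5, 0.3, 0.2, 0.1], "val_loss": [1.0, 0.6, 0.65, 0.7, 0.8]}

print(looks_overfit(healthy["loss"], healthy["val_loss"]))
print(looks_overfit(overfit["loss"], overfit["val_loss"]))
```

In practice the same idea is built into Keras's `EarlyStopping` callback, which monitors `val_loss` and halts training when it stops improving.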
Common pitfalls:
- Too large a batch size: can hurt generalization, converging to solutions that perform poorly on new data.
- Too few epochs: Model underfits, missing patterns in data.
- Too many epochs: Model overfits, memorizing training data but failing on new data.
- Ignoring validation metrics: Only watching training loss can hide overfitting.
- Data leakage: If validation data leaks into training, metrics look falsely good.
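The usual defense against leakage is to split the data before any preprocessing, so validation samples never influence statistics (means, vocabularies, scalers) fitted during training. A minimal sketch (sample count and split fraction are arbitrary):

```python
import random

def train_val_split(n_samples: int, val_fraction: float = 0.2, seed: int = 0):
    """Split sample indices into disjoint train and validation sets.
    Fit any scalers or vocabularies on the train indices ONLY,
    then apply them unchanged to the validation indices."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_val = int(n_samples * val_fraction)
    return indices[n_val:], indices[:n_val]

train_idx, val_idx = train_val_split(100)
print(len(train_idx), len(val_idx))
```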
Your model has 98% accuracy but only 12% recall on fraud cases. Is it good for production? Why or why not?
Answer: No. When fraud cases are rare, accuracy is dominated by the majority class, so 98% can be achieved while missing almost every fraud. A recall of 12% means 88% of frauds slip through, which is dangerous in production. For fraud detection, high recall is the critical metric, so the model can catch as many frauds as possible.
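The scenario can be reproduced with a hypothetical confusion matrix: out of 10,000 transactions, 200 are fraud, and the model catches only 24 of them:

```python
# Hypothetical counts for an imbalanced fraud dataset:
tp, fn = 24, 176   # of 200 actual frauds, only 24 are caught
fp, tn = 24, 9776  # almost all legitimate transactions are classified correctly

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)
print(f"accuracy = {accuracy:.0%}, recall = {recall:.0%}")
```

A model that predicted "not fraud" for every transaction would score 98% accuracy on this data too, which is exactly why accuracy alone is misleading here.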