First neural network in TensorFlow - Model Metrics & Evaluation
When training your first neural network, the main metric to watch is accuracy: the fraction of examples the model predicts correctly. It is simple to understand, making it a natural starting point. However, if your data has many more examples of one class than another, accuracy alone can be misleading. In that case, also watch the loss, which measures how far the model's predictions are from the true labels during training.
Confusion Matrix Example:

                 Predicted
                   0     1
             -----------------
  Actual  0  |    50  |   10 |
             -----------------
  Actual  1  |     5  |   35 |
             -----------------
Here:
- True Positives (TP) = 35 (correctly predicted 1)
- True Negatives (TN) = 50 (correctly predicted 0)
- False Positives (FP) = 10 (predicted 1 when the true class was 0)
- False Negatives (FN) = 5 (predicted 0 when the true class was 1)
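These four counts are all you need to compute the headline metrics by hand. A minimal sketch in plain Python, using the counts from the matrix above (the variable names are my own, not from any library):

```python
# Counts from the confusion matrix above
tp, tn, fp, fn = 35, 50, 10, 5

total = tp + tn + fp + fn        # 100 examples in all
accuracy = (tp + tn) / total     # fraction of all predictions that were correct
precision = tp / (tp + fp)       # of everything predicted 1, how much really was 1
recall = tp / (tp + fn)          # of everything truly 1, how much was caught

print(f"accuracy:  {accuracy:.2f}")   # 0.85
print(f"precision: {precision:.2f}")  # 0.78
print(f"recall:    {recall:.2f}")     # 0.88
```

Libraries like scikit-learn compute the same numbers for you, but doing it once by hand makes the definitions stick.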
Imagine your first neural network is a spam detector:
- Precision means: When the model says "spam", how often is it really spam? High precision means fewer good emails get marked as spam.
- Recall means: Of all the spam emails, how many did the model catch? High recall means fewer spam emails sneak into your inbox.
If you want to avoid missing spam, focus on recall. If you want to avoid losing good emails, focus on precision. Your first neural network might need tuning to find the right balance.
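One way to see this trade-off is to vary the decision threshold applied to the model's predicted spam probabilities. A toy sketch; the probabilities and labels below are made up purely for illustration:

```python
# Made-up predicted spam probabilities and true labels (1 = spam)
probs  = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    0,    1,    1,    0,    0,    0]

def precision_recall(threshold):
    # Mark an email as spam when its predicted probability clears the threshold
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# High threshold: fewer emails flagged, precision up, recall down
print(precision_recall(0.7))  # precision 1.0, recall 0.5
# Low threshold: more emails flagged, recall up, precision down
print(precision_recall(0.3))  # precision ~0.67, recall 1.0
```

Raising the threshold protects good emails (precision); lowering it catches more spam (recall). Tuning this threshold is often easier than retraining the network.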
For a simple first neural network on balanced data:
- Good: Accuracy above 80%, loss steadily decreasing, precision and recall both above 75%.
- Bad: Accuracy near 50% (like random guessing), loss not improving, very low precision or recall (below 50%).
Good metrics mean your network is learning patterns. Bad metrics mean it might be guessing or stuck.
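The rules of thumb above can be turned into a quick sanity check. A minimal sketch; `looks_healthy` and its cutoffs are my own, simply mirroring the numbers listed above:

```python
def looks_healthy(accuracy, precision, recall):
    """Rough health check for a first network on balanced data,
    using the rules of thumb above (not universal cutoffs)."""
    return accuracy > 0.80 and precision > 0.75 and recall > 0.75

print(looks_healthy(0.85, 0.78, 0.88))  # True: learning real patterns
print(looks_healthy(0.51, 0.40, 0.45))  # False: close to random guessing
```

Note this is only a first-pass filter; the loss trend over epochs still needs a look of its own.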
Common pitfalls to watch for:
- Accuracy paradox: High accuracy can be misleading if one class dominates the data.
- Data leakage: If test data leaks into training, metrics look unrealistically good.
- Overfitting indicators: Very high training accuracy with low test accuracy means the model memorizes the training data but fails on new data.
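The accuracy paradox is easy to reproduce: a model that always predicts the majority class scores high accuracy while learning nothing. A toy sketch with made-up imbalanced labels:

```python
# 90 negatives, 10 positives: a heavily imbalanced toy dataset
labels = [0] * 90 + [1] * 10
preds  = [0] * 100   # a "model" that always predicts the majority class

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
recall = tp / (tp + fn)

print(accuracy)  # 0.9 -- looks decent
print(recall)    # 0.0 -- catches no positives at all
```

This is why precision and recall on the minority class belong next to accuracy in any evaluation of imbalanced data.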
Practice question: Your first neural network has 98% accuracy but only 12% recall on the positive class (e.g., fraud). Is it good for production? Why not?
Answer: No, it is not good. The model misses most positive cases (only 12% recall), which is critical in fraud detection. High accuracy is misleading because most data is negative. You need to improve recall to catch more fraud.
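The arithmetic behind that answer can be checked with a concrete, made-up confusion matrix: 10,000 transactions, of which 100 are fraud:

```python
# Hypothetical fraud-detection results on 10,000 transactions (100 frauds)
tp, fn = 12, 88      # only 12 of 100 frauds caught -> 12% recall
tn, fp = 9788, 112   # the huge negative class dominates accuracy

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)

print(accuracy)  # 0.98 -- looks great
print(recall)    # 0.12 -- misses 88 of 100 frauds
```

The 98% accuracy comes almost entirely from correctly labeling legitimate transactions, which is exactly why recall on the fraud class is the number to improve.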