When we make predictions with a model, we want to know how well it works. The main metrics to check are accuracy, precision, recall, and F1 score. These tell us if the model guesses right, if it finds all the important cases, and if it avoids false alarms. We choose the metric based on what matters most for the problem.
Prediction and evaluation in TensorFlow - Model Metrics & Evaluation
| | Predicted Positive | Predicted Negative |
|-----------------|---------------------|---------------------|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |
Example:
TP = 50, FP = 10, TN = 30, FN = 10
Total samples = 100
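To make the example concrete, here is a minimal sketch that derives the four metrics from those counts using the standard definitions. The variable names are illustrative only.

```python
# Confusion-matrix counts from the example above.
tp, fp, tn, fn = 50, 10, 30, 10

total = tp + fp + tn + fn
accuracy = (tp + tn) / total        # share of all predictions that are correct
precision = tp / (tp + fp)          # of predicted positives, how many are real
recall = tp / (tp + fn)             # of real positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
# accuracy=0.800 precision=0.833 recall=0.833 f1=0.833
```

Precision and recall are equal here only because FP and FN happen to be equal in this example.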
Precision means when the model says "yes", how often is it right? High precision means few false alarms.
Recall means how many actual "yes" cases the model finds. High recall means it misses very few real cases.
Example 1: Spam filter - high precision is important so good emails are not marked as spam.
Example 2: Cancer detection - high recall is important so no cancer cases are missed.
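Good values (a rough guide; the right targets depend on the problem): Accuracy > 90%, Precision and Recall both above 85%, F1 score close to 1.
Bad values: Accuracy around 50% on a balanced two-class problem (no better than random guessing), Precision or Recall below 50%, F1 score very low.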
Note: High accuracy alone can be misleading if classes are imbalanced.
- Accuracy paradox: High accuracy can hide poor performance on rare classes.
- Data leakage: Using future or test data during training artificially inflates metrics.
- Overfitting: Very high training accuracy but low test accuracy means the model has memorized the training data instead of learning patterns that generalize.
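The accuracy paradox in the list above can be illustrated with a small sketch. The data is hypothetical (990 negatives, 10 positives), and the "model" is a stand-in that always predicts the majority class.

```python
import numpy as np

# Hypothetical imbalanced labels: 990 negatives and 10 positives.
y_true = np.array([0] * 990 + [1] * 10)

# A stand-in "model" that always predicts the majority class (0).
y_pred = np.zeros_like(y_true)

accuracy = np.mean(y_pred == y_true)
tp = np.sum((y_pred == 1) & (y_true == 1))
fn = np.sum((y_pred == 0) & (y_true == 1))
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
# accuracy=0.99 recall=0.00
```

The accuracy looks excellent even though not a single positive case is found, which is why recall (or F1) should be checked alongside it.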
Your model has 98% accuracy but only 12% recall on fraud cases. Is it good for production?
Answer: No. Even though accuracy is high, the model misses 88% of fraud cases (low recall). This is bad because catching fraud is critical. The model needs improvement to find more fraud cases.
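To see how 98% accuracy and 12% recall can coexist, the sketch below uses made-up counts that match those two figures. The 10,000 transactions and 200 fraud cases are assumptions for illustration, not data from the question.

```python
# Hypothetical counts chosen to match the question:
# 10,000 transactions, 200 of them fraud, 98% accuracy, 12% recall.
tp, fn = 24, 176      # fraud caught / fraud missed
fp, tn = 24, 9776     # legitimate flagged / legitimate passed

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.98 recall=0.12 f1=0.19
```

Because legitimate transactions dominate the data, the true negatives carry the accuracy figure, while the low F1 score exposes the weak fraud detection.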
Practice
What does the model.predict() function do in TensorFlow?
Solution
Step 1: Understand the purpose of model.predict()
This function returns the model's output predictions for new input data, typically after training.
Step 2: Differentiate from other functions
Training uses model.fit(), saving uses model.save(), and deleting a model is manual memory management; none of these is what predict() does.
Final Answer:
It gives the model's guesses on new data -> Option D
Quick Check:
model.predict() = model guesses [OK]
- Confusing predict() with fit() for training
- Thinking predict() saves the model
- Assuming predict() deletes the model
Which call measures a trained model's performance on the test data X_test and y_test?
Solution
Step 1: Identify the evaluation function
TensorFlow uses model.evaluate() to measure performance on test data.
Step 2: Check other options
model.predict() makes predictions, model.fit() trains, and model.score() is not a TensorFlow method (it belongs to scikit-learn estimators).
Final Answer:
model.evaluate(X_test, y_test) -> Option B
Quick Check:
Evaluate = measure performance [OK]
- Using predict() instead of evaluate() for metrics
- Trying to train with evaluate()
- Using non-existent model.score() method
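The difference between the two calls can be sketched as follows. The toy data, the model size, and the 0.5 threshold are assumptions for illustration: model.evaluate() reports the loss and the compiled metrics directly, while model.predict() only returns raw outputs that you must score yourself.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: the label is 1 when the single feature is positive.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 1)).astype("float32")
y = (X > 0).astype("float32")            # shape (400, 1)
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, verbose=0)

# evaluate() returns the loss followed by each compiled metric.
loss, acc = model.evaluate(X_test, y_test, verbose=0)

# predict() returns raw outputs (probabilities here); scoring is manual.
probs = model.predict(X_test, verbose=0)
manual_acc = np.mean((probs > 0.5) == (y_test > 0.5))

print(f"evaluate accuracy={acc:.3f} manual accuracy={manual_acc:.3f}")
```

The two accuracy values should agree, since the manual computation applies the same 0.5 threshold that Keras uses for binary accuracy by default.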
    import tensorflow as tf
    import numpy as np

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(1, input_shape=(1,))
    ])
    model.compile(optimizer='sgd', loss='mse')

    X = np.array([1, 2, 3, 4], dtype=float)
    y = np.array([2, 4, 6, 8], dtype=float)
    model.fit(X, y, epochs=10, verbose=0)

    predictions = model.predict(np.array([5.0]))
    print(predictions)
Solution
Step 1: Understand the model and data
The model is a single linear layer trained to approximate y = 2*x.
Step 2: Predict for input 5.0
After training, the model should predict roughly 2*5 = 10, so the output is a 2D array near [[10.0]]. With only 10 epochs of SGD and random initial weights, the exact value varies from run to run.
Final Answer:
A numpy array close to [[10.0]] -> Option C
Quick Check:
Prediction for 5 ≈ 10 [OK]
- Expecting exact 10 instead of approximate
- Confusing input shape causing error
- Thinking prediction returns scalar, not array
You call model.evaluate(X_test, y_test) but get a ValueError about mismatched shapes. What is the most likely cause?
Solution
Step 1: Understand the error cause
A ValueError about a shape mismatch usually means the input or output data shapes don't match what the model expects.
Step 2: Check other options
Not compiling the model causes a different error, predict() vs evaluate() is unrelated, and optimizer issues cause training errors, not shape errors.
Final Answer:
The shapes of X_test and y_test do not match the model's expected input and output shapes -> Option A
Quick Check:
Shape mismatch causes ValueError in evaluate() [OK]
- Ignoring shape mismatch and blaming optimizer
- Confusing predict() with evaluate() errors
- Not compiling model but blaming shape error
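One way to catch this kind of error before calling model.evaluate() is a small pre-flight check. The helper below, check_eval_shapes, is hypothetical and not part of TensorFlow. It assumes you already know the per-sample input shape the model expects, which for a built Keras model can typically be read from model.input_shape[1:].

```python
import numpy as np


def check_eval_shapes(X_test, y_test, feature_shape):
    """Hypothetical pre-flight check before calling model.evaluate()."""
    feature_shape = tuple(feature_shape)
    if X_test.shape[1:] != feature_shape:
        raise ValueError(
            f"X_test has per-sample shape {X_test.shape[1:]}, "
            f"but the model expects {feature_shape}"
        )
    if X_test.shape[0] != y_test.shape[0]:
        raise ValueError(
            f"X_test has {X_test.shape[0]} samples, "
            f"but y_test has {y_test.shape[0]}"
        )


# Assumed example: a model whose input layer takes 4 features per sample.
X_test = np.zeros((10, 3))   # wrong: only 3 features per sample
y_test = np.zeros((10, 1))

try:
    check_eval_shapes(X_test, y_test, feature_shape=(4,))
except ValueError as err:
    print(err)
# X_test has per-sample shape (3,), but the model expects (4,)
```

Printing X_test.shape and y_test.shape next to the model's expected shapes is often enough to spot the problem without any helper.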
You want to compare a trained model's performance on two test sets, X_test1, y_test1 and X_test2, y_test2. Which approach correctly compares their accuracy using TensorFlow?
Solution
Step 1: Understand evaluation for performance
model.evaluate() returns the loss and metrics on test data without training, which makes it ideal for comparing performance.
Step 2: Why other options are incorrect
Comparing raw predictions is not a direct accuracy measure; retraining or fitting on the test sets changes the model, so the comparison is no longer fair.
Final Answer:
Use model.evaluate() on both test sets separately and compare the returned loss or accuracy values -> Option A
Quick Check:
Evaluate test sets separately for fair comparison [OK]
- Comparing raw predictions without metrics
- Retraining on test data for comparison
- Using fit() on test data instead of evaluate()
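A minimal sketch of that comparison is shown below. The synthetic data and the make_split helper are assumptions for illustration. The sketch also snapshots the weights to confirm that model.evaluate() leaves the model unchanged.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(1)


def make_split(n):
    """Hypothetical data generator: label is 1 when the two features sum above 0."""
    X = rng.normal(size=(n, 2)).astype("float32")
    y = (X.sum(axis=1, keepdims=True) > 0).astype("float32")
    return X, y


X_train, y_train = make_split(300)
X_test1, y_test1 = make_split(100)
X_test2, y_test2 = make_split(100)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, verbose=0)

weights_before = [w.copy() for w in model.get_weights()]

# Evaluate each test set separately; return_dict=True labels the values.
results1 = model.evaluate(X_test1, y_test1, verbose=0, return_dict=True)
results2 = model.evaluate(X_test2, y_test2, verbose=0, return_dict=True)

# evaluate() does not update the weights, so both results describe the same model.
weights_unchanged = all(
    np.array_equal(a, b) for a, b in zip(weights_before, model.get_weights())
)

print("test set 1:", results1)
print("test set 2:", results2)
print("weights unchanged:", weights_unchanged)
```

Because the same frozen model scores both sets, any gap in accuracy reflects a difference between the test sets rather than a change in the model.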
