Evaluation and confusion matrix in Computer Vision - Model Metrics & Evaluation

Metrics & Evaluation - Evaluation and confusion matrix
Which metric matters for this concept and WHY

In computer vision, especially for classification tasks, the confusion matrix helps us see how well the model predicts each class. Key metrics like accuracy, precision, recall, and F1 score come from this matrix. They tell us if the model is good at finding the right objects (high recall) and if its guesses are usually correct (high precision). This is important because some mistakes are worse than others depending on the task.

Confusion matrix or equivalent visualization (ASCII)
    Confusion Matrix Example (3 classes: Cat, Dog, Rabbit)

                     Predicted
    Actual    | Cat | Dog | Rabbit |
    ----------+-----+-----+--------+
    Cat       | 50  |  2  |   3    |
    Dog       |  4  | 45  |   1    |
    Rabbit    |  2  |  3  |  40    |

    Explanation:
    - 50 images of cats correctly predicted as cats (True Positives for Cat)
    - 2 cats wrongly predicted as dogs (False Negatives for Cat, False Positives for Dog)
    - And so on for other classes.
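
The per-class metrics can be read straight off this matrix. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative and not part of the original lesson. Precision divides each diagonal entry by its column sum (everything predicted as that class), and recall divides it by its row sum (everything that actually is that class).

```python
import numpy as np

classes = ["Cat", "Dog", "Rabbit"]

# Rows are actual classes, columns are predicted classes (same layout as above).
cm = np.array([[50, 2, 3],
               [4, 45, 1],
               [2, 3, 40]])

tp = cm.diagonal()                  # correct predictions per class
precision = tp / cm.sum(axis=0)     # column sums = all predicted as the class
recall = tp / cm.sum(axis=1)        # row sums = all actual members of the class
f1 = 2 * precision * recall / (precision + recall)
accuracy = tp.sum() / cm.sum()

print(f"accuracy: {accuracy:.3f}")  # accuracy: 0.900
for name, p, r, f in zip(classes, precision, recall, f1):
    print(f"{name}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
# Cat: precision=0.893 recall=0.909 f1=0.901
# Dog: precision=0.900 recall=0.900 f1=0.900
# Rabbit: precision=0.909 recall=0.889 f1=0.899
```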
    
Precision vs Recall tradeoff with concrete examples

Imagine a model that detects cats in photos.

  • High precision: When the model says "this is a cat," it is almost always right. Few wrong cat guesses. Good if you want to avoid false alarms.
  • High recall: The model finds almost all cats in the photos, even if it sometimes mistakes other animals for cats. Good if missing a cat is bad.

For example, if you want to find all cats for a rescue mission, recall is more important. But if you want to tag only real cats in a photo album, precision matters more.
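
One way to see the tradeoff is to vary the decision threshold of a cat detector. The sketch below uses made-up labels and scores (both are assumptions for illustration, not data from this lesson) together with scikit-learn's precision_score and recall_score. A strict threshold yields few but reliable cat predictions, while a loose one finds every cat at the cost of false alarms.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical data: 1 = cat, 0 = not a cat, plus a model's "cat" score per image.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
cat_scores = np.array([0.95, 0.85, 0.60, 0.40, 0.70, 0.45, 0.30, 0.20, 0.10, 0.05])

results = {}
for threshold in (0.8, 0.35):
    # Predict "cat" whenever the score reaches the threshold.
    y_pred = (cat_scores >= threshold).astype(int)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    results[threshold] = (p, r)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")
# threshold=0.8: precision=1.00 recall=0.50
# threshold=0.35: precision=0.67 recall=1.00
```

In this toy data, the rescue-mission setting would favor the lower threshold, and the photo-album setting would favor the higher one.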

What "good" vs "bad" metric values look like for this use case

Good metrics for a balanced computer vision classifier might be:

  • Accuracy above 90% on a balanced dataset
  • Precision and recall both above 85%
  • F1 score close to precision and recall, showing balance

Bad metrics might be:

  • Accuracy high but recall very low (model misses many objects)
  • Precision very low (many false alarms)
  • Confusion matrix shows many misclassifications between similar classes
Metrics pitfalls (accuracy paradox, data leakage, overfitting indicators)
  • Accuracy paradox: High accuracy can be misleading if classes are imbalanced. For example, if 95% of images are dogs, a model that always guesses dog gets 95% accuracy but is useless.
  • Data leakage: If test images duplicate or closely resemble training images, the metrics will look better than the model's real-world performance.
  • Overfitting: Very high training accuracy but low test accuracy means the model memorizes training images but can't generalize.
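
The accuracy paradox from the first bullet can be reproduced in a few lines. This is a sketch with a synthetic 95/5 dog/cat split and a "model" that always answers dog. The label lists are invented for illustration; the metric functions are from scikit-learn.

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score, recall_score

# Synthetic, imbalanced dataset: 95 dog images and 5 cat images.
y_true = ["dog"] * 95 + ["cat"] * 5
# A useless "model" that predicts dog for every image.
y_pred = ["dog"] * 100

accuracy = accuracy_score(y_true, y_pred)
cat_recall = recall_score(y_true, y_pred, pos_label="cat")
balanced = balanced_accuracy_score(y_true, y_pred)  # mean of per-class recall

print(accuracy)    # 0.95
print(cat_recall)  # 0.0
print(balanced)    # 0.5
```

Accuracy looks strong, but recall on the minority class and balanced accuracy both expose that no cat is ever found.
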
Self-check: Your model has 98% accuracy but 12% recall on fraud. Is it good?

No, it is not good for fraud detection. The 98% accuracy is misleading because fraud cases are rare. The 12% recall means the model finds only 12% of actual frauds, missing most fraud cases. For fraud detection, high recall is critical to catch as many frauds as possible.
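
To make the numbers concrete, here is one hypothetical set of counts that produces exactly 98% accuracy and 12% recall. The dataset size of 10,000 and the 2% fraud rate are assumptions chosen for this sketch. Many other count combinations would give the same two figures.

```python
# Hypothetical counts: 10,000 transactions, 200 of them fraudulent (2%).
tp, fn = 24, 176     # frauds caught vs. frauds missed
tn, fp = 9776, 24    # legitimate cases passed vs. wrongly flagged

total = tp + fn + tn + fp
accuracy = (tp + tn) / total
recall = tp / (tp + fn)
precision = tp / (tp + fp)

print(total)      # 10000
print(accuracy)   # 0.98
print(recall)     # 0.12
print(precision)  # 0.5
```

Under these assumed counts, 176 of 200 frauds slip through even though accuracy is 98%, because the 9,776 easy legitimate cases dominate the score.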

Key Result
Confusion matrix metrics like precision, recall, and F1 score reveal model strengths and weaknesses beyond accuracy, crucial for reliable computer vision evaluation.

Practice

1. What does a confusion matrix help you understand in a classification model?
easy
A. The speed of the model during training
B. How well the model predicts each class by showing true and false predictions
C. The number of layers in the model
D. The size of the input images

Solution

  1. Step 1: Understand the purpose of a confusion matrix

    A confusion matrix shows counts of correct and incorrect predictions for each class, helping evaluate classification performance.
  2. Step 2: Match the description to the options

    Only option B, "How well the model predicts each class by showing true and false predictions," describes this purpose. The other options concern unrelated aspects of the model.
  3. Final Answer:

    How well the model predicts each class by showing true and false predictions -> Option B
  4. Quick Check:

    Confusion matrix = True/False predictions summary [OK]
Hint: Confusion matrix shows correct vs wrong class predictions [OK]
Common Mistakes:
  • Confusing confusion matrix with model speed
  • Thinking it shows model architecture details
  • Assuming it shows input data size
2. Which of the following is the correct way to create a confusion matrix using scikit-learn in Python?
easy
A. confusion_matrix(y_pred)
B. confusionMatrix(y_true, y_pred)
C. conf_matrix(y_pred, y_true)
D. confusion_matrix(y_true, y_pred)

Solution

  1. Step 1: Recall the scikit-learn function signature

    The function to create a confusion matrix is confusion_matrix(y_true, y_pred) with true labels first, then predicted labels.
  2. Step 2: Check each option for correctness

    confusion_matrix(y_true, y_pred) matches the correct function name and argument order. Options A, B, and C have a missing argument, a wrong name, or a wrong argument order.
  3. Final Answer:

    confusion_matrix(y_true, y_pred) -> Option D
  4. Quick Check:

    Correct function name and argument order [OK]
Hint: Use exact function name and order: confusion_matrix(true, pred) [OK]
Common Mistakes:
  • Using wrong function name capitalization
  • Swapping true and predicted labels
  • Passing only one argument
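
For image classifiers the labels are often class names rather than integers. The sketch below, using made-up labels, shows the same call with scikit-learn's optional labels parameter, which fixes the order of rows and columns. Without it, the labels are sorted.

```python
from sklearn.metrics import confusion_matrix

# Made-up ground truth and predictions for five images.
y_true = ["cat", "dog", "rabbit", "cat", "dog"]
y_pred = ["cat", "cat", "rabbit", "cat", "dog"]

# True labels first, predicted labels second; `labels` sets row/column order.
cm = confusion_matrix(y_true, y_pred, labels=["cat", "dog", "rabbit"])
print(cm)
# [[2 0 0]
#  [1 1 0]
#  [0 0 1]]
```
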
3. Given the following code, what will be the output confusion matrix?
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 0, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm)
medium
A. [[3 0] [1 3]]
B. [[2 1] [0 4]]
C. [[3 1] [0 3]]
D. [[4 0] [1 2]]

Solution

  1. Step 1: Count true positives and negatives

    Count each (true, predicted) pair: (0, 0) occurs 3 times, (0, 1) occurs 0 times, (1, 0) occurs once, and (1, 1) occurs 3 times.
  2. Step 2: Build confusion matrix

    Matrix rows = true labels, columns = predicted labels. So cm = [[3,0],[1,3]] matches counts.
  3. Final Answer:

    [[3 0] [1 3]] -> Option A
  4. Quick Check:

    Count matches matrix entries [OK]
Hint: Count true/pred pairs carefully to fill matrix [OK]
Common Mistakes:
  • Mixing rows and columns order
  • Counting predicted labels as true labels
  • Ignoring zero counts
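
For a binary problem like this one, the four cells can be unpacked by name with ravel(), which reads the matrix row by row. The following sketch reuses the labels from the question and derives precision and recall for class 1 from those cells.

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 0, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1, 1]

cm = confusion_matrix(y_true, y_pred)

# Row-major order for a 2x2 matrix: TN, FP, FN, TP (class 1 is "positive").
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)  # 3 0 1 3

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(precision)  # 1.0
print(recall)     # 0.75
```
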
4. You wrote this code but got an error:
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_pred, y_true)
print(cm)
What is the likely cause of the error?
medium
A. Using print instead of return
B. Missing import statement for confusion_matrix
C. Swapped y_pred and y_true arguments causing shape mismatch
D. y_pred and y_true are not defined variables

Solution

  1. Step 1: Check what the snippet defines

    The snippet imports confusion_matrix but never assigns y_pred or y_true, so Python raises a NameError before the function is even called.
  2. Step 2: Rule out the other options

    The import is present and print is valid. Swapping the arguments does not raise an error: confusion_matrix(y_pred, y_true) runs and returns the transposed matrix, which is a silent bug rather than a crash.
  3. Final Answer:

    y_pred and y_true are not defined variables -> Option D
  4. Quick Check:

    Undefined names raise NameError; swapped arguments only transpose the matrix [OK]
Hint: Define both label arrays first, then pass true labels first, predicted second [OK]
Common Mistakes:
  • Swapping true and predicted labels
  • Forgetting to import confusion_matrix
  • Using undefined variables
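
A quick way to see what argument order actually does is to run both orders on the same labels. This sketch reuses the labels from question 3 so that it is self-contained. In it the swapped call runs without raising and returns the transpose of the correct matrix, so rows and columns silently trade meaning.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 0, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1, 1]

correct = confusion_matrix(y_true, y_pred)   # rows = true, columns = predicted
swapped = confusion_matrix(y_pred, y_true)   # rows = predicted, columns = true

print(correct)
# [[3 0]
#  [1 3]]
print(swapped)
# [[3 1]
#  [0 3]]

# The swapped call is just the transpose: false positives and false negatives trade places.
is_transpose = np.array_equal(swapped, correct.T)
print(is_transpose)  # True
```
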
5. You have a 3-class image classifier with classes A, B, and C. The confusion matrix is:
[[5 2 0]
 [1 7 1]
 [0 2 6]]
What is the precision for class B?
hard
A. 7 / (2 + 7 + 2) ≈ 0.64
B. 7 / (1 + 7 + 1) ≈ 0.78
C. 7 / (5 + 1 + 0) ≈ 1.17
D. 7 / (7 + 1 + 2) = 0.70

Solution

  1. Step 1: Identify precision formula for class B

    Precision = True Positives for B / (All predicted as B). True Positives = cm[1][1] = 7.
  2. Step 2: Calculate total predicted as B

    Sum column 1: cm[0][1]=2 + cm[1][1]=7 + cm[2][1]=2 = 11. So precision = 7/11 ≈ 0.64. Option B divides by the row sum instead, which gives recall (7/9 ≈ 0.78).
  3. Final Answer:

    7 / (2 + 7 + 2) ≈ 0.64 -> Option A
  4. Quick Check:

    Precision = TP / predicted positives [OK]
Hint: Precision = TP / sum of predicted class column [OK]
Common Mistakes:
  • Using row sums instead of column sums
  • Confusing precision with recall
  • Ignoring off-diagonal values in predicted class column
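
The column-sum rule can be cross-checked against scikit-learn. The sketch below rebuilds label arrays that would produce this exact matrix and then asks precision_score and recall_score for per-class values. The reconstruction step is an illustrative device, not part of the original exercise.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Classes A, B, C map to indices 0, 1, 2; rows = true, columns = predicted.
cm = np.array([[5, 2, 0],
               [1, 7, 1],
               [0, 2, 6]])

# cm[i, j] samples have true class i and predicted class j, so repeat each
# (i, j) index pair that many times to rebuild the label arrays.
true_idx, pred_idx = np.indices(cm.shape)
y_true = np.repeat(true_idx.ravel(), cm.ravel())
y_pred = np.repeat(pred_idx.ravel(), cm.ravel())

precision = precision_score(y_true, y_pred, average=None)  # one value per class
recall = recall_score(y_true, y_pred, average=None)

print(round(precision[1], 3))  # 0.636  (7 / 11, column sum)
print(round(recall[1], 3))     # 0.778  (7 / 9, row sum)
```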