
CV project workflow in Computer Vision - Model Metrics & Evaluation

Which metric matters for a CV project workflow, and why

In computer vision projects, the right metric depends on the task. For image classification, accuracy is common because it reports the fraction of images that were labeled correctly. For object detection, mean Average Precision (mAP) is the key metric because it reflects both how well the model locates objects and how well it labels them. For segmentation, Intersection over Union (IoU) measures how closely the predicted regions overlap the true regions. Choosing the right metric tells you whether the model is learning what actually matters for the task.
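
To make the IoU definition concrete, here is a minimal sketch in plain Python. The function name `mask_iou` and the two tiny masks are hypothetical and only for illustration. Returning 0.0 when both masks are empty is an assumed convention, since libraries differ on that case.

```python
def mask_iou(pred_mask, true_mask):
    """IoU of two binary masks given as equally sized lists of 0/1 rows."""
    intersection = 0  # pixels marked in both masks
    union = 0         # pixels marked in at least one mask
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks give IoU 0.0.
    return intersection / union if union else 0.0


# Hypothetical 3x4 masks: the prediction is shifted one pixel to the right.
true_mask = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]
pred_mask = [[0, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 0, 0, 0]]

iou = mask_iou(pred_mask, true_mask)
print(f"IoU = {iou:.3f}")  # 2 shared pixels out of 6 covered pixels
```

A one-pixel shift already drops IoU to about 0.33 here, which is why IoU is a stricter signal than per-pixel accuracy when objects are small.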

Confusion matrix example for image classification
      |            | Predicted Cat | Predicted Dog |
      |------------|---------------|---------------|
      | Actual Cat | 50 (TP)       | 5 (FN)        |
      | Actual Dog | 3 (FP)        | 42 (TN)       |

      Rows are actual classes, columns are predicted classes, and Cat is the positive class.

      Total samples = 50 + 5 + 3 + 42 = 100

      Precision (Cat) = TP / (TP + FP) = 50 / (50 + 3) = 0.943
      Recall (Cat) = TP / (TP + FN) = 50 / (50 + 5) = 0.909

The matrix shows which classes the model confuses with each other and supplies the counts needed to calculate accuracy, precision, and recall.
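
The arithmetic can be checked with a few lines of Python. This sketch reads the cat/dog matrix with rows as actual classes and columns as predicted classes, and treats "cat" as the positive class. Under that reading the 3 dogs predicted as cats are the false positives and the 5 cats predicted as dogs are the false negatives.

```python
# Counts from the cat/dog example ("cat" is the positive class).
tp = 50  # actual cat, predicted cat
fn = 5   # actual cat, predicted dog
fp = 3   # actual dog, predicted cat
tn = 42  # actual dog, predicted dog

total = tp + fn + fp + tn
accuracy = (tp + tn) / total      # share of all images labeled correctly
precision_cat = tp / (tp + fp)    # of images predicted "cat", share that are cats
recall_cat = tp / (tp + fn)       # of actual cats, share the model found

print(f"accuracy={accuracy:.3f} "
      f"precision={precision_cat:.3f} recall={recall_cat:.3f}")
```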

Precision vs Recall tradeoff with examples

Imagine a face recognition system for phone unlock:

  • High precision: The system rarely lets strangers in (few false accepts), but it may sometimes fail to recognize the owner (false rejects).
  • High recall: The system almost always recognizes the owner (few false rejects), but it may sometimes let strangers in (false accepts).

Depending on what matters more (security or convenience), you adjust the model to favor precision or recall.
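
One common way to make this adjustment is to move the decision threshold on the model's match score. The sketch below uses made-up scores and a hypothetical helper `precision_recall`. It treats "accepted as owner" as the positive prediction, and returning precision 1.0 when nothing is accepted is an assumed convention.

```python
# Hypothetical (match_score, is_owner) pairs from unlock attempts;
# a higher score means the model is more confident it sees the owner.
attempts = [
    (0.95, True), (0.90, True), (0.80, True), (0.60, True), (0.40, True),
    (0.85, False), (0.55, False), (0.30, False), (0.20, False), (0.10, False),
]


def precision_recall(attempts, threshold):
    """Precision and recall when every score >= threshold unlocks the phone."""
    tp = sum(1 for score, owner in attempts if score >= threshold and owner)
    fp = sum(1 for score, owner in attempts if score >= threshold and not owner)
    fn = sum(1 for score, owner in attempts if score < threshold and owner)
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # assumed convention
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


for threshold in (0.3, 0.5, 0.9):
    precision, recall = precision_recall(attempts, threshold)
    print(f"threshold={threshold:.1f} precision={precision:.3f} recall={recall:.3f}")
```

With these scores, the strict threshold of 0.9 admits no strangers but rejects the owner three times out of five, while the lenient threshold of 0.3 always admits the owner and also admits three strangers.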

What good vs bad metric values look like in CV projects

For image classification:

  • Good: Accuracy above 90%, precision and recall balanced above 85%
  • Bad: Accuracy below 70%, or very low recall on any class, meaning many true examples of that class are missed

For object detection:

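  • Good: mAP above 0.7 generally means the model finds and labels objects well
  • Bad: mAP below 0.4 generally means poor detection and many missed or mislabeled objects

These thresholds are rough guides. mAP depends on the IoU threshold used for matching, so compare only values computed the same way.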
Common pitfalls in CV metrics
  • Accuracy paradox: High accuracy can be misleading if classes are imbalanced (e.g., many background images, few objects).
  • Data leakage: Using test images in training inflates metrics falsely.
  • Overfitting: Very high training accuracy but low test accuracy means the model has memorized the training images instead of learning patterns that generalize.
  • Ignoring metric choice: Using accuracy for detection tasks can hide poor localization performance.
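
The data leakage pitfall can be checked mechanically before training. The sketch below is one possible approach, not a standard tool: it hashes raw image bytes and reports test images that also appear in the training set. The helper names and the in-memory "files" are hypothetical. It only catches byte-identical copies, so resized or re-encoded duplicates would need a different technique.

```python
import hashlib


def content_hash(image_bytes):
    """SHA-256 digest of the raw file contents."""
    return hashlib.sha256(image_bytes).hexdigest()


def find_leaked(train_images, test_images):
    """Return names of test images whose bytes also occur in the training set."""
    train_hashes = {content_hash(data) for data in train_images.values()}
    return sorted(
        name for name, data in test_images.items()
        if content_hash(data) in train_hashes
    )


# Hypothetical stand-ins for image files: file name -> raw bytes.
train_images = {"cat_001.png": b"\x89PNG-cat-1", "dog_001.png": b"\x89PNG-dog-1"}
test_images = {"cat_copy.png": b"\x89PNG-cat-1", "dog_777.png": b"\x89PNG-dog-7"}

leaked = find_leaked(train_images, test_images)
print(leaked)  # test files that duplicate a training file
```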
Self-check question

Your image classifier has 98% accuracy but only 12% recall on a rare class. Is it good for production?

Answer: No. The model misses most examples of the rare class (low recall), which can be critical depending on the task. High accuracy is misleading if the rare class is important.
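
The numbers in this question can be reproduced with a small worked example. The counts below are hypothetical, chosen only so that accuracy comes out at 98% and rare-class recall at 12%. They also show that a baseline that never predicts the rare class reaches the same accuracy.

```python
# Hypothetical counts: 10,000 images, 200 of them in the rare class.
tp = 24     # rare-class images correctly found
fn = 176    # rare-class images missed
fp = 24     # other images wrongly flagged as the rare class
tn = 9776   # other images correctly rejected

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall_rare = tp / (tp + fn)

# A baseline that never predicts the rare class is right on every
# non-rare image and wrong on every rare one.
baseline_accuracy = (fp + tn) / total

print(f"accuracy={accuracy:.2f} recall={recall_rare:.2f} "
      f"baseline_accuracy={baseline_accuracy:.2f}")
```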

Key Result
Choosing the right metric, such as accuracy, mAP, or IoU, is key to evaluating computer vision models correctly and avoiding misleading results.

Practice

1. Which step comes first in a typical computer vision project workflow?
easy
A. Monitor model performance
B. Deploy the model to production
C. Tune hyperparameters
D. Define the problem and collect data

Solution

  1. Step 1: Understand the project start

    The first step is to clearly define what problem you want to solve and gather the images or videos needed.
  2. Step 2: Recognize the order of workflow steps

    Data collection must happen before training, tuning, or deployment.
  3. Final Answer:

    Define the problem and collect data -> Option D
  4. Quick Check:

    First step = Define problem and collect data [OK]
Hint: Start with problem definition and data collection
Common Mistakes:
  • Thinking deployment is the first step
  • Skipping problem definition
  • Ignoring data collection importance
2. Which of the following is the correct syntax to split data into training and testing sets in Python using scikit-learn?
easy
A. train_test_split(data, test_size=0.2)
B. split_train_test(data, 0.2)
C. train_test(data, test=0.2)
D. train_test_split(data, test=0.2)

Solution

  1. Step 1: Recall scikit-learn function name and parameters

    The correct function is train_test_split with parameter test_size to specify test data fraction.
  2. Step 2: Check parameter correctness

    train_test_split(data, test_size=0.2) uses correct function and parameter names.
  3. Final Answer:

    train_test_split(data, test_size=0.2) -> Option A
  4. Quick Check:

    Correct function and parameter = train_test_split(data, test_size=0.2) [OK]
Hint: Remember the exact function and parameter names from scikit-learn
Common Mistakes:
  • Using wrong function name
  • Using incorrect parameter names
  • Confusing test_size with test
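
For reference, a fuller call might look like the sketch below. The file names and labels are hypothetical placeholders for a real image dataset. `stratify` and `random_state` are optional `train_test_split` parameters that the question does not use; they are included here as an assumption about what a classification project would typically want.

```python
from sklearn.model_selection import train_test_split

# Hypothetical file names and labels standing in for a real image dataset.
image_paths = [f"img_{i:02d}.png" for i in range(10)]
labels = [0] * 5 + [1] * 5  # 0 = cat, 1 = dog

# test_size=0.2 holds out 20% of the samples; stratify keeps the class
# ratio the same in both splits; random_state makes the split repeatable.
train_paths, test_paths, train_labels, test_labels = train_test_split(
    image_paths, labels, test_size=0.2, stratify=labels, random_state=42
)

print(len(train_paths), len(test_paths))
```

When several arrays are passed, the function returns a train/test pair for each one, in the order they were given.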
3. Given this code snippet for training a simple CNN model, what will be printed after training for 1 epoch? Assume x_train holds 28x28 grayscale images and y_train holds integer class labels.
import tensorflow as tf
model = tf.keras.Sequential([
  tf.keras.layers.Conv2D(16, 3, activation='relu', input_shape=(28,28,1)),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=1, batch_size=32)
print(history.history['accuracy'][0])
medium
A. An error because 'accuracy' is not in history
B. The loss value after training
C. A float value between 0 and 1 representing training accuracy
D. The number of training samples

Solution

  1. Step 1: Understand model.fit output

    The history object stores metrics per epoch. Accessing history.history['accuracy'][0] gives training accuracy after first epoch.
  2. Step 2: Confirm metric requested

    Since metrics=['accuracy'] was set, accuracy is recorded and printed as a float between 0 and 1.
  3. Final Answer:

    A float value between 0 and 1 representing training accuracy -> Option C
  4. Quick Check:

    history.history['accuracy'][0] = training accuracy [OK]
Hint: history.history['accuracy'][0] holds the first-epoch training accuracy
Common Mistakes:
  • Confusing accuracy with loss
  • Expecting an error accessing accuracy
  • Thinking it prints sample count
4. You trained a model but it performs poorly on new images. Which step in the workflow might be causing this issue?
medium
A. Monitoring was set up correctly
B. Data preparation was insufficient or incorrect
C. Hyperparameters were tuned perfectly
D. Model deployment was done too early

Solution

  1. Step 1: Analyze poor model performance cause

    Poor results on new images mean the model does not generalize. This often traces back to data preparation: too little data, mislabeled samples, or training images that do not resemble the images seen in use.
  2. Step 2: Eliminate unrelated options

    Correct monitoring and well-tuned hyperparameters do not cause poor performance, and deployment timing does not change how well the model generalizes.
  3. Final Answer:

    Data preparation was insufficient or incorrect -> Option B
  4. Quick Check:

    Poor performance = bad data prep [OK]
Hint: Check data preparation first when a model fails on new data
Common Mistakes:
  • Blaming deployment timing
  • Assuming hyperparameters are always perfect
  • Ignoring data quality issues
5. In a computer vision project, after deploying your model, you notice accuracy drops over time. What is the best next step to maintain model performance?
hard
A. Collect new data and retrain the model regularly
B. Stop monitoring since model is deployed
C. Reduce the size of the training dataset
D. Ignore the drop as normal and do nothing

Solution

  1. Step 1: Understand model drift after deployment

    Models can lose accuracy when the images seen in production drift away from the training data. Collecting new data and retraining lets the model adapt to those changes.
  2. Step 2: Evaluate other options

    Stopping monitoring or ignoring drops will worsen performance. Reducing training data size is counterproductive.
  3. Final Answer:

    Collect new data and retrain the model regularly -> Option A
  4. Quick Check:

    Maintain performance = retrain with new data [OK]
Hint: Retrain the model regularly with fresh data after deployment
Common Mistakes:
  • Ignoring monitoring after deployment
  • Reducing training data size
  • Assuming model never needs updates
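
The monitoring step can be sketched as a rolling accuracy check over recently labeled production predictions. Everything here is illustrative: the `AccuracyMonitor` class, the window size, the 0.8 threshold, and the label stream are assumptions, not part of any particular library.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent labeled predictions."""

    def __init__(self, window_size, min_accuracy):
        self.results = deque(maxlen=window_size)  # old entries drop off automatically
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        self.results.append(predicted == actual)

    def accuracy(self):
        return sum(self.results) / len(self.results) if self.results else None

    def needs_retraining(self):
        # Only judge once the window is full, to avoid reacting to a few samples.
        window_full = len(self.results) == self.results.maxlen
        return window_full and self.accuracy() < self.min_accuracy


monitor = AccuracyMonitor(window_size=5, min_accuracy=0.8)

# Hypothetical stream of (predicted, actual) labels: correct at first,
# then increasingly wrong as the incoming images drift.
stream = [("cat", "cat"), ("dog", "dog"), ("cat", "cat"), ("dog", "dog"),
          ("cat", "cat"), ("cat", "dog"), ("dog", "cat"), ("cat", "dog")]

flags = []
for predicted, actual in stream:
    monitor.record(predicted, actual)
    flags.append(monitor.needs_retraining())

print(flags)
```

In practice the true labels usually arrive late or only for a sampled subset, so the window size and threshold are choices to tune per project.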