Computer Vision | ML | ~20 mins

Annotation quality in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Annotation quality

Problem: You are training an object detection model on images with bounding box annotations. The model's validation mAP is low because some annotations are incorrect or inconsistent.

Current Metrics: Training mAP: 85%, Validation mAP: 60%

Issue: The model overfits to noisy or wrong annotations, causing poor validation performance.

Your Task

Improve validation mAP to at least 75% by improving annotation quality without changing the model architecture.

Do not change the model architecture or hyperparameters.

Only modify the dataset annotations or data preprocessing.

Solution

import cv2
import json

# Load annotations
with open('annotations.json', 'r') as f:
    annotations = json.load(f)

# Clamp bounding boxes to the image and drop degenerate ones.
# Boxes are unpacked as [x_min, y_min, x_max, y_max] in pixel coordinates.

def fix_bboxes(annots, img_width, img_height):
    fixed_annots = []
    for obj in annots:
        x_min, y_min, x_max, y_max = obj['bbox']
        # Clamp coordinates to valid pixel indices [0, size - 1]
        x_min = max(0, min(x_min, img_width - 1))
        y_min = max(0, min(y_min, img_height - 1))
        x_max = max(0, min(x_max, img_width - 1))
        y_max = max(0, min(y_max, img_height - 1))
        # Skip boxes that are inverted or have zero area after clamping
        if x_max <= x_min or y_max <= y_min:
            continue
        fixed_annots.append({'label': obj['label'], 'bbox': [x_min, y_min, x_max, y_max]})
    return fixed_annots

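# Process dataset: annotations maps each image file name to its list of boxes
fixed_dataset = {}
for img_name, annots in annotations.items():
    img = cv2.imread(f'images/{img_name}')
    if img is None:
        continue  # skip images that are missing or unreadable
    h, w = img.shape[:2]
    fixed_annots = fix_bboxes(annots, w, h)
    # Keep only images that still have at least one valid box
    if fixed_annots:
        fixed_dataset[img_name] = fixed_annots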

# Save fixed annotations
with open('fixed_annotations.json', 'w') as f:
    json.dump(fixed_dataset, f)

# After fixing annotations, retrain the model with the same code but using fixed_annotations.json
# (Model training code not shown here for brevity)

# Expected improved metrics after retraining:
# Training mAP: 82%
# Validation mAP: 77%

Reviewed and fixed bounding box coordinates to ensure they are within image boundaries.

Removed invalid or zero-area bounding boxes.

Filtered out images with no valid annotations after cleaning.

Used cleaned annotations for retraining the model.
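
Before retraining, it can help to see how much of the dataset the cleaning steps above actually touch. The following is a minimal sketch, not part of the original solution: `audit_bboxes` is a hypothetical helper that applies the same clamping rule as `fix_bboxes` and counts how many boxes are left unchanged, clamped, or dropped. The sample boxes and the 100x100 image size are made up for illustration.

```python
from collections import Counter

def audit_bboxes(annots, img_width, img_height):
    """Count boxes that are 'ok', 'clamped', or 'dropped' (hypothetical helper)."""
    counts = Counter()
    for obj in annots:
        x_min, y_min, x_max, y_max = obj['bbox']
        # Same clamping rule as fix_bboxes: keep coordinates in [0, size - 1]
        cx_min = max(0, min(x_min, img_width - 1))
        cy_min = max(0, min(y_min, img_height - 1))
        cx_max = max(0, min(x_max, img_width - 1))
        cy_max = max(0, min(y_max, img_height - 1))
        if cx_max <= cx_min or cy_max <= cy_min:
            counts['dropped'] += 1   # inverted or zero-area after clamping
        elif (cx_min, cy_min, cx_max, cy_max) != (x_min, y_min, x_max, y_max):
            counts['clamped'] += 1   # box extended past the image border
        else:
            counts['ok'] += 1
    return counts

# Illustrative annotations for a 100x100 image (made-up values)
sample = [
    {'label': 'cat', 'bbox': [10, 10, 50, 50]},    # valid
    {'label': 'dog', 'bbox': [-5, 20, 40, 120]},   # spills outside the image
    {'label': 'car', 'bbox': [60, 60, 60, 80]},    # zero width
]
report = audit_bboxes(sample, img_width=100, img_height=100)
print(dict(report))
```

A large share of dropped boxes would suggest a systematic labeling or export problem that is better fixed at the source than filtered out.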

Results Interpretation

Before: Training mAP: 85%, Validation mAP: 60%
After: Training mAP: 82%, Validation mAP: 77%

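Cleaning the annotations narrows the gap between training and validation mAP from 25 points (85% vs. 60%) to 5 points (82% vs. 77%). Training mAP drops slightly because the model no longer fits the noisy labels, while validation mAP rises, which shows how much data quality matters in machine learning.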
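
One way to see why noisy boxes hurt a detection metric: mAP counts a prediction as correct only when its intersection over union (IoU) with the annotated box exceeds a threshold, commonly 0.5. The sketch below is illustrative only. It assumes boxes in `[x_min, y_min, x_max, y_max]` form with continuous (width = x_max - x_min) areas, and the example coordinates are invented.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    # Corners of the overlapping rectangle, if any
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

prediction = [10, 10, 60, 60]
clean_label = [10, 10, 60, 60]    # tight, correct annotation
noisy_label = [35, 10, 85, 60]    # same box drawn 25 px too far right

iou_clean = iou(prediction, clean_label)
iou_noisy = iou(prediction, noisy_label)
print(f"clean: {iou_clean:.2f}, noisy: {iou_noisy:.2f}")
```

With these made-up numbers, the same prediction passes a 0.5 threshold against the clean label but fails it against the shifted one, so a good detection is scored as a miss.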

Bonus Experiment

Try data augmentation techniques such as random cropping and flipping to further improve validation mAP. Apply the same geometric transform to the bounding boxes so they stay aligned with the objects.

💡 Hint

Augmentation can help the model generalize better by showing varied examples, reducing overfitting.
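
As a rough sketch of the flipping idea, the example below mirrors an image left to right and moves its boxes with it. `hflip_with_boxes` is a hypothetical helper, not part of the original solution. It assumes the same annotation layout as the solution code (`label` plus `[x_min, y_min, x_max, y_max]`) and treats coordinates as pixel indices, so column `x` maps to `w - 1 - x`.

```python
import numpy as np

def hflip_with_boxes(img, annots):
    """Flip an image left-right and mirror its boxes (hypothetical helper)."""
    w = img.shape[1]
    flipped_img = img[:, ::-1].copy()  # reverse the column order
    flipped_annots = []
    for obj in annots:
        x_min, y_min, x_max, y_max = obj['bbox']
        # Pixel-index convention: the old right edge becomes the new left edge
        flipped_annots.append({
            'label': obj['label'],
            'bbox': [w - 1 - x_max, y_min, w - 1 - x_min, y_max],
        })
    return flipped_img, flipped_annots

# Synthetic 100x200 image with a white rectangle standing in for an object
img = np.zeros((100, 200, 3), dtype=np.uint8)
img[20:40, 10:50] = 255
annots = [{'label': 'cat', 'bbox': [10, 20, 49, 39]}]

flipped_img, flipped_annots = hflip_with_boxes(img, annots)
print(flipped_annots[0]['bbox'])  # [150, 20, 189, 39]
```

Flipping the image without updating the boxes would introduce exactly the kind of annotation error this experiment tries to remove.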

Practice

(1/5)

1. What does annotation quality in computer vision mainly refer to?

easy

A. How accurate and clear the labels on images are

B. The speed of the model training process

C. The size of the image dataset

D. The type of camera used to capture images

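Solution

Step 1: Understand the meaning of annotation quality

Annotation quality describes how accurate, consistent, and clear the labels attached to images are.

Step 2: Compare options to definition

Training speed, dataset size, and camera type are not properties of the labels, so only option A matches the definition.

Final Answer: A. How accurate and clear the labels on images are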
