
How to Annotate Images for Object Detection in Computer Vision

To annotate images for object detection, draw bounding boxes around each object of interest and assign a label to each box describing the object. This creates structured data that computer vision models use to learn how to find and classify objects in new images.


Syntax

Image annotation for object detection typically involves creating a list of objects, each with a bounding box and a label. The bounding box is defined by pixel coordinates (x_min, y_min, x_max, y_max) that mark the rectangle around the object. Coordinates are measured from the top-left corner of the image, so x grows to the right and y grows downward. The label is a string naming the object class.

Example annotation format in Python:

annotations = [
    {"label": "cat", "bbox": [x_min, y_min, x_max, y_max]},
    {"label": "dog", "bbox": [x_min, y_min, x_max, y_max]}
]

Each bbox is a list of four numbers: (x_min, y_min) is the top-left corner of the box and (x_max, y_max) is the bottom-right corner. With concrete values:

python

annotations = [
    {"label": "cat", "bbox": [50, 30, 200, 180]},
    {"label": "dog", "bbox": [220, 40, 370, 200]}
]
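Because each box stores its two corners, the width, height, and area are not stored directly and have to be derived. The following is a minimal sketch of that arithmetic; the helper name bbox_size is hypothetical and not part of any library.

```python
def bbox_size(bbox):
    """Return (width, height, area) for a corner-format box.

    Assumes bbox is [x_min, y_min, x_max, y_max] in pixels.
    """
    x_min, y_min, x_max, y_max = bbox
    width = x_max - x_min
    height = y_max - y_min
    return width, height, width * height


annotations = [
    {"label": "cat", "bbox": [50, 30, 200, 180]},
    {"label": "dog", "bbox": [220, 40, 370, 200]},
]

# Map each label to the size of its box
sizes = {ann["label"]: bbox_size(ann["bbox"]) for ann in annotations}
print(sizes)
# {'cat': (150, 150, 22500), 'dog': (150, 160, 24000)}
```

Very small or zero-area boxes found this way are often a sign of an accidental click in the annotation tool.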


Example

This example defines annotations in Python and uses the OpenCV library to draw each bounding box and its label onto the image. Drawing the boxes is a quick way to check visually that the annotations match the objects.

python

import cv2

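# Load image (cv2.imread returns None if the file cannot be read)
image = cv2.imread('example.jpg')
if image is None:
    raise FileNotFoundError('Could not read example.jpg')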

# Define annotations: label and bounding box coordinates
annotations = [
    {"label": "cat", "bbox": [50, 30, 200, 180]},
    {"label": "dog", "bbox": [220, 40, 370, 200]}
]

# Draw bounding boxes and labels on the image
for ann in annotations:
    x_min, y_min, x_max, y_max = ann['bbox']
    label = ann['label']
    # Draw rectangle
    cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
    # Put label text
    cv2.putText(image, label, (x_min, y_min - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)

# Save the annotated image
cv2.imwrite('annotated_example.jpg', image)
print('Annotation complete and image saved as annotated_example.jpg')

Output

Annotation complete and image saved as annotated_example.jpg
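The example places each label 10 pixels above the box. For a box that touches the top of the image, that position falls outside the visible area. The sketch below picks a text origin that stays inside the image. The function name label_origin and the default text_height of 20 pixels are assumptions made for illustration. The returned point is meant to be passed as the org argument of cv2.putText, which is the bottom-left corner of the text.

```python
def label_origin(bbox, text_height=20, padding=10):
    """Choose a bottom-left text origin for a box label.

    bbox is [x_min, y_min, x_max, y_max] in pixels.
    text_height is an assumed approximate height of the rendered text.
    """
    x_min, y_min, x_max, y_max = bbox
    above_y = y_min - padding
    # Enough room above the box: draw the label there
    if above_y - text_height >= 0:
        return (x_min, above_y)
    # Otherwise draw the label inside the box, just under its top edge
    return (x_min, y_min + text_height + padding)


print(label_origin([50, 30, 200, 180]))  # (50, 20)
print(label_origin([50, 5, 200, 180]))   # (50, 35)
```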


Common Pitfalls

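Incorrect bounding box coordinates: Coordinates must lie within the image dimensions and be ordered as (x_min, y_min, x_max, y_max). Mixing up the order produces boxes with the wrong position or size.
Missing labels: Every bounding box must have a clear label describing the object class.
Overlapping objects: When objects overlap, give each one its own box and label rather than merging them into a single box.
Inconsistent annotation formats: Use a single format across your dataset (e.g., Pascal VOC XML or COCO JSON). Formats encode boxes differently: Pascal VOC stores corner coordinates (xmin, ymin, xmax, ymax), while COCO stores [x, y, width, height].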

python

wrong_annotation = {"label": "car", "bbox": [300, 400, 100, 200]}  # x_min > x_max and y_min > y_max (wrong)

# Corrected annotation
correct_annotation = {"label": "car", "bbox": [100, 200, 300, 400]}
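Mistakes like these can be caught automatically before training. The following is a minimal sketch of such a check. The function name find_bbox_problems and the 640 by 480 image size are assumptions chosen for illustration, not part of any annotation tool.

```python
def find_bbox_problems(annotation, image_width, image_height):
    """Return a list of problems found in one annotation (empty if none)."""
    problems = []
    if not annotation.get("label"):
        problems.append("missing label")

    bbox = annotation.get("bbox")
    if bbox is None or len(bbox) != 4:
        problems.append("bbox must have four values")
        return problems

    x_min, y_min, x_max, y_max = bbox
    # Corner order: the minimum must be strictly smaller than the maximum
    if x_min >= x_max:
        problems.append("x_min must be less than x_max")
    if y_min >= y_max:
        problems.append("y_min must be less than y_max")
    # Bounds: every coordinate must lie inside the image
    if min(x_min, x_max) < 0 or max(x_min, x_max) > image_width:
        problems.append("x coordinates fall outside the image")
    if min(y_min, y_max) < 0 or max(y_min, y_max) > image_height:
        problems.append("y coordinates fall outside the image")
    return problems


wrong_annotation = {"label": "car", "bbox": [300, 400, 100, 200]}
correct_annotation = {"label": "car", "bbox": [100, 200, 300, 400]}

# Assumed image size of 640 x 480 pixels
print(find_bbox_problems(wrong_annotation, 640, 480))
# ['x_min must be less than x_max', 'y_min must be less than y_max']
print(find_bbox_problems(correct_annotation, 640, 480))
# []
```

Returning a list of messages instead of raising on the first error makes it easier to report every problem in a dataset in one pass.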


Quick Reference

Step	Description
1. Choose annotation tool	Use tools like LabelImg, CVAT, or MakeSense.ai to draw boxes.
2. Draw bounding boxes	Mark rectangles tightly around each object.
3. Assign labels	Give each box a clear object class name.
4. Save annotations	Export in formats like Pascal VOC XML or COCO JSON.
5. Verify annotations	Check for accuracy and consistency before training.
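Step 4 can be sketched in code for the COCO JSON case. COCO stores each box as [x, y, width, height], so corner-format boxes need converting on export. The function name to_coco, the file name, and the 640 by 480 image size below are assumptions for illustration. The output contains only the core images, categories, and annotations fields, not everything a full COCO file may carry.

```python
import json


def to_coco(annotations, file_name, image_width, image_height, image_id=1):
    """Convert corner-format annotations for one image to a COCO-style dict."""
    # Assign category ids in alphabetical order, starting at 1
    labels = sorted({ann["label"] for ann in annotations})
    category_ids = {label: i for i, label in enumerate(labels, start=1)}

    coco = {
        "images": [{"id": image_id, "file_name": file_name,
                    "width": image_width, "height": image_height}],
        "categories": [{"id": cid, "name": label}
                       for label, cid in category_ids.items()],
        "annotations": [],
    }
    for ann_id, ann in enumerate(annotations, start=1):
        x_min, y_min, x_max, y_max = ann["bbox"]
        width, height = x_max - x_min, y_max - y_min
        coco["annotations"].append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": category_ids[ann["label"]],
            "bbox": [x_min, y_min, width, height],  # COCO: x, y, width, height
            "area": width * height,
            "iscrowd": 0,
        })
    return coco


annotations = [
    {"label": "cat", "bbox": [50, 30, 200, 180]},
    {"label": "dog", "bbox": [220, 40, 370, 200]},
]

coco = to_coco(annotations, "example.jpg", 640, 480)
print(json.dumps(coco["annotations"][0]))
```

In practice, annotation tools such as CVAT export these formats directly. A conversion like this is mainly useful when annotations were created by hand or by a script.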


Key Takeaways

Draw bounding boxes around each object with coordinates (x_min, y_min, x_max, y_max).

Assign a clear label to every bounding box describing the object class.

Use consistent annotation formats and tools to avoid errors.

Verify annotations carefully to ensure quality training data.

Popular annotation tools include LabelImg, CVAT, and MakeSense.ai.