What is Bounding box representation in Computer Vision?

Introduction

A bounding box shows where an object is in an image by enclosing it in a simple rectangle. You might use one:

To mark where a cat is in a photo for a pet app.

To find cars in street images for traffic monitoring.

To detect faces in pictures for photo tagging.

To locate fruits in images for a smart farm system.

To highlight products in shopping app images.

Syntax

bounding_box = (x_min, y_min, x_max, y_max)

x_min and y_min are the coordinates of the top-left corner of the box.

x_max and y_max are the coordinates of the bottom-right corner of the box.

Examples

This box starts at 50 pixels from the left and 30 pixels from the top, and ends at 200 pixels from the left and 180 pixels from the top.

bbox = (50, 30, 200, 180)
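The four corner values are enough to derive a box's size and center. The following is a minimal sketch using the example box above; the variable names `width`, `height`, `area`, and `center` are illustrative and not part of any library.

```python
# Example box in corner format: (x_min, y_min, x_max, y_max)
bbox = (50, 30, 200, 180)
x_min, y_min, x_max, y_max = bbox

# Size is the difference between opposite corners
width = x_max - x_min
height = y_max - y_min
area = width * height

# Center is the midpoint of the two corners
center = ((x_min + x_max) / 2, (y_min + y_max) / 2)

print(width, height, area, center)  # 150 150 22500 (125.0, 105.0)
```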

This box covers the top-left 100x100 pixels of the image.

bbox = (0, 0, 100, 100)
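A box like this can be used to crop the region it covers. The sketch below assumes the image is a NumPy array, as OpenCV uses, and treats x_max and y_max as exclusive bounds so that the crop is exactly 100 pixels on each side; the 250x250 blank image is only a stand-in for a real photo.

```python
import numpy as np

# Stand-in image: 250 rows (height) by 250 columns (width), 3 color channels
image = np.zeros((250, 250, 3), dtype=np.uint8)

bbox = (0, 0, 100, 100)  # (x_min, y_min, x_max, y_max)
x_min, y_min, x_max, y_max = bbox

# NumPy arrays are indexed [row, column], so y comes before x
crop = image[y_min:y_max, x_min:x_max]

print(crop.shape)  # (100, 100, 3)
```

Note the index order: the box stores x first, but the array slice takes y first.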

Some tools and datasets (COCO annotations, for example) give a box as its top-left corner plus a width and height instead of two corners.

bbox = (x, y, width, height)  # alternative format
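Converting between the two formats is simple arithmetic. The helper names `corners_to_xywh` and `xywh_to_corners` below are hypothetical, chosen for this sketch rather than taken from a library.

```python
def corners_to_xywh(bbox):
    """Convert (x_min, y_min, x_max, y_max) to (x, y, width, height)."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min, y_min, x_max - x_min, y_max - y_min)


def xywh_to_corners(bbox):
    """Convert (x, y, width, height) to (x_min, y_min, x_max, y_max)."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


corner_box = (50, 30, 200, 180)
xywh_box = corners_to_xywh(corner_box)

print(xywh_box)                   # (50, 30, 150, 150)
print(xywh_to_corners(xywh_box))  # (50, 30, 200, 180)
```

Because both formats are four-number tuples, it helps to record which one a box uses, for instance in a comment or variable name.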

Sample Program

This code creates a black image and draws a green rectangle using the bounding box coordinates. It saves the image as 'bbox_example.png' and prints the box coordinates.

import cv2
import numpy as np

# Create a blank image
image = np.zeros((250, 250, 3), dtype=np.uint8)

# Define bounding box coordinates
bbox = (50, 30, 200, 180)  # (x_min, y_min, x_max, y_max)

# Draw the bounding box on the image
cv2.rectangle(image, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (0, 255, 0), 2)

# Save the image to file
cv2.imwrite('bbox_example.png', image)

print(f"Bounding box drawn from ({bbox[0]}, {bbox[1]}) to ({bbox[2]}, {bbox[3]})")

Output

Bounding box drawn from (50, 30) to (200, 180)

Important Notes

A bounding box is a compact way to record where an object is: four numbers per object.

Image coordinates usually start at (0, 0) in the top-left corner; x increases to the right and y increases downward.

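Keep box coordinates within the image bounds. OpenCV clips shapes drawn past the edge, but out-of-range values can give wrong or empty results when you crop with array slicing.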
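One way to keep a box inside the image is to clamp each coordinate to the image size. This is a minimal sketch: `clip_box` is a hypothetical helper, and it assumes x_max and y_max are exclusive bounds, so they may equal the image width and height.

```python
def clip_box(bbox, image_width, image_height):
    """Clamp a (x_min, y_min, x_max, y_max) box to the image bounds."""
    x_min, y_min, x_max, y_max = bbox
    # Clamp x values to [0, image_width] and y values to [0, image_height]
    x_min = max(0, min(x_min, image_width))
    x_max = max(0, min(x_max, image_width))
    y_min = max(0, min(y_min, image_height))
    y_max = max(0, min(y_max, image_height))
    return (x_min, y_min, x_max, y_max)


# A box that sticks out past the left and right edges of a 250x250 image
print(clip_box((-20, 30, 300, 180), 250, 250))  # (0, 30, 250, 180)
```

A box that lies entirely outside the image collapses to zero width or height after clipping, which is easy to check before cropping.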

Summary

A bounding box uses four numbers to show where an object is in an image.

In corner format, the box runs from the top-left corner (x_min, y_min) to the bottom-right corner (x_max, y_max).

Bounding boxes give programs a simple way to record and report where objects are in an image.