
Why detection localizes objects in images

Introduction

Detection finds where objects are in pictures. It helps us know not just what is in the image, but also where each object is located.

Common uses of detection include:
Finding faces in photos to blur them for privacy.
Detecting vehicles on the road for self-driving cars.
Counting animals in wildlife pictures by locating each one.
Spotting damaged parts in factory product images.
Tracking players on a sports field during a game.
Syntax
Detection model input: image
Detection model output: list of bounding boxes with class labels and confidence scores

The input is usually a full image.

The output includes boxes that show where each object is, what it is, and how sure the model is.
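As a rough illustration of that output format, one detection could be held in a small Python record like the one below. This is only a sketch: the `Detection` class, its `center` helper, and the coordinates are hypothetical and do not come from any particular detection library.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical container, not a library type)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) corners in pixels
    label: str                              # predicted class name
    confidence: float                       # model score between 0 and 1

    def center(self) -> Tuple[float, float]:
        """Return the midpoint of the box, a simple answer to 'where is it?'."""
        x1, y1, x2, y2 = self.box
        return ((x1 + x2) / 2, (y1 + y2) / 2)


# Made-up coordinates, for illustration only
detections: List[Detection] = [
    Detection(box=(10, 20, 110, 220), label='cat', confidence=0.95),
    Detection(box=(150, 40, 300, 260), label='dog', confidence=0.88),
]

for d in detections:
    print(d.label, d.confidence, d.center())
```

Each record answers the three questions at once: what the object is (`label`), where it is (`box`), and how sure the model is (`confidence`).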

Examples
The model finds a cat and a dog in the image and shows their locations with boxes.
image -> detection_model -> [
  {box: [x1, y1, x2, y2], label: 'cat', confidence: 0.95},
  {box: [x1, y1, x2, y2], label: 'dog', confidence: 0.88}
]
No objects detected, so the output list is empty.
image -> detection_model -> []
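Because every detection carries a confidence score, a common follow-up step is to keep only the detections the model is reasonably sure about. The sketch below assumes detections are plain dictionaries shaped like the examples above; the `keep_confident` helper, the coordinates, and the 0.5 default threshold are illustrative assumptions, not part of any library.

```python
def keep_confident(detections, min_confidence=0.5):
    """Return only detections whose confidence is at least min_confidence."""
    return [d for d in detections if d['confidence'] >= min_confidence]


# Made-up detections, for illustration only
detections = [
    {'box': [10, 20, 110, 220], 'label': 'cat', 'confidence': 0.95},
    {'box': [150, 40, 300, 260], 'label': 'dog', 'confidence': 0.88},
]

# Only the cat passes a 0.9 threshold
print(keep_confident(detections, min_confidence=0.9))

# An empty detection list simply stays empty
print(keep_confident([]))
```

The empty-list case needs no special handling, which is one reason a plain list is a convenient output format.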
Sample Model

This code uses OpenCV's pre-trained Haar cascade face detector to find faces in an image. It draws a box around each detected face and prints how many faces it found.

import cv2

# Load a simple pre-trained model from OpenCV for face detection
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

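# Load an example image from OpenCV's sample data
img = cv2.imread(cv2.samples.findFile('lena.jpg'))
if img is None:
    raise FileNotFoundError('Could not read lena.jpg')
# The Haar cascade detector works on grayscale images
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)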

# Detect faces; each result is a box given as (x, y, width, height)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Draw a blue rectangle around each face (OpenCV colors are in BGR order)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

# Show the number of faces detected
print(f'Faces detected: {len(faces)}')
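Note that `detectMultiScale` returns each box as (x, y, width, height), while the examples earlier write boxes as [x1, y1, x2, y2]. A minimal sketch of the conversion follows; the `xywh_to_xyxy` helper and the sample box are hypothetical names and values chosen for illustration.

```python
def xywh_to_xyxy(box):
    """Convert (x, y, width, height) to [x1, y1, x2, y2] corner format."""
    x, y, w, h = box
    # The bottom-right corner is the top-left corner shifted by the box size
    return [int(x), int(y), int(x + w), int(y + h)]


# Made-up face box, standing in for one row of detectMultiScale output
faces = [(50, 60, 100, 120)]

boxes = [xywh_to_xyxy(face) for face in faces]
print(boxes)
```

This is the same arithmetic the drawing loop uses when it passes `(x, y)` and `(x+w, y+h)` to `cv2.rectangle`.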
Important Notes

Detection models give both the object type and its position.

Localization helps in tasks like counting or tracking objects.
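Counting, for instance, falls out of the detection list almost for free, since each located object is one entry. The sketch below assumes dictionary-style detections like those in the examples; `count_by_label` and the sample data are hypothetical.

```python
from collections import Counter


def count_by_label(detections):
    """Count how many detected objects there are of each class."""
    return Counter(d['label'] for d in detections)


# Made-up detections, for illustration only
detections = [
    {'box': [10, 20, 110, 220], 'label': 'cat', 'confidence': 0.95},
    {'box': [150, 40, 300, 260], 'label': 'dog', 'confidence': 0.88},
    {'box': [320, 30, 420, 210], 'label': 'cat', 'confidence': 0.81},
]

counts = count_by_label(detections)
print(counts['cat'], counts['dog'])
```

A classifier that only says "this image contains a cat" could not produce these counts, because it never separates one cat from another.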

A bounding box is a simple rectangle, usually written as corner coordinates [x1, y1, x2, y2], that marks where an object is in the image.

Summary

Detection tells us what objects are in an image and where they are.

It outputs boxes around each object with labels and confidence.

This helps in many real-world tasks like safety, counting, and monitoring.