How to Do Object Detection in Python for Computer Vision
To do object detection in Python for computer vision, you can use libraries like OpenCV with pre-trained models such as YOLO or SSD. These models analyze images and return bounding boxes with class labels and confidence scores.
Syntax
Object detection in Python typically involves loading a pre-trained model, preparing the input image, running the model to detect objects, and then processing the output to draw bounding boxes and labels.
- Load model: Load weights and configuration files.
- Prepare input: Convert image to required format (e.g., blob).
- Run detection: Pass input to the model to get detections.
- Process output: Extract bounding boxes, class IDs, and confidence scores.
```python
import cv2

# Load pre-trained YOLO model
net = cv2.dnn.readNetFromDarknet('yolov3.cfg', 'yolov3.weights')

# Load image
image = cv2.imread('image.jpg')

# Prepare input blob
blob = cv2.dnn.blobFromImage(image, 1/255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)

# Get output layer names
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]

# Run forward pass
outputs = net.forward(output_layers)

# Process outputs to get bounding boxes, confidences, class IDs
# (processing code goes here)
```
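The blob-preparation step can be mimicked in plain NumPy to show what `cv2.dnn.blobFromImage` produces (ignoring the resize/interpolation it also performs); `to_blob` is our illustrative helper, not an OpenCV API:

```python
import numpy as np

def to_blob(image_bgr, scale=1 / 255.0):
    """Sketch of cv2.dnn.blobFromImage (resize step omitted):
    scale pixel values, swap BGR -> RGB, move to NCHW layout."""
    img = image_bgr.astype(np.float32) * scale  # scale to 0..1
    img = img[:, :, ::-1]                       # swapRB=True: BGR -> RGB
    return img.transpose(2, 0, 1)[None]         # HWC -> NCHW, add batch dim

# A dummy already-resized 416x416 BGR image:
img = np.zeros((416, 416, 3), dtype=np.uint8)
print(to_blob(img).shape)  # (1, 3, 416, 416)
```

The resulting 4-D shape (batch, channels, height, width) is exactly what `net.setInput()` expects, which is why passing a raw HWC image fails.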
Example
This example shows how to detect objects in an image using OpenCV and YOLOv3. It loads the model, processes the image, detects objects, and draws bounding boxes with labels.
```python
import cv2
import numpy as np

# Load YOLO
net = cv2.dnn.readNetFromDarknet('yolov3.cfg', 'yolov3.weights')
with open('coco.names', 'r') as f:
    classes = [line.strip() for line in f]

# Load image
img = cv2.imread('image.jpg')
height, width, _ = img.shape

# Create blob
blob = cv2.dnn.blobFromImage(img, 1/255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)

# Get output layers
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]

# Forward pass
outs = net.forward(output_layers)

class_ids = []
confidences = []
bboxes = []

# Process detections
for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.5:
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            bboxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)

# Non-max suppression to remove overlaps
indexes = cv2.dnn.NMSBoxes(bboxes, confidences, 0.5, 0.4)

# Draw bounding boxes
font = cv2.FONT_HERSHEY_PLAIN
for i in indexes.flatten():
    x, y, w, h = bboxes[i]
    label = str(classes[class_ids[i]])
    confidence = confidences[i]
    color = (0, 255, 0)
    cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
    cv2.putText(img, f'{label} {confidence:.2f}', (x, y - 5), font, 1, color, 1)

# Save or show image
cv2.imwrite('output.jpg', img)
print('Object detection complete, output saved as output.jpg')
```
Output
Object detection complete, output saved as output.jpg
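The per-detection decoding in the example (a normalized `[cx, cy, w, h]` center box converted to a pixel-space corner box, plus the best class score) can be isolated as a small helper. `decode_detection` is our name for illustration, not an OpenCV API; the detection row layout `[cx, cy, w, h, objectness, class scores...]` matches the YOLOv3 Darknet output used above:

```python
import numpy as np

def decode_detection(detection, img_w, img_h):
    """Convert one YOLO output row [cx, cy, w, h, obj, *class_scores]
    (box fields normalized to 0..1) into a pixel-space [x, y, w, h]
    corner box plus the winning class id and its score."""
    scores = detection[5:]
    class_id = int(np.argmax(scores))
    confidence = float(scores[class_id])
    cx, cy = detection[0] * img_w, detection[1] * img_h
    w, h = detection[2] * img_w, detection[3] * img_h
    x, y = int(cx - w / 2), int(cy - h / 2)   # center -> top-left corner
    return [x, y, int(w), int(h)], class_id, confidence

# A synthetic detection centered in a 416x416 image, covering a quarter of it:
det = np.array([0.5, 0.5, 0.25, 0.25, 0.9, 0.1, 0.8, 0.05])
box, cls, conf = decode_detection(det, 416, 416)
print(box, cls, conf)  # [156, 156, 104, 104] 1 0.8
```

This is exactly the arithmetic performed inside the example's nested loop, applied to one row.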
Common Pitfalls
- Wrong model files: Using incompatible or missing config and weights files causes errors.
- Incorrect input size: Not resizing images to the model's expected size (e.g., 416x416) reduces accuracy.
- Ignoring confidence threshold: Not filtering low-confidence detections leads to many false positives.
- Skipping non-max suppression: Overlapping boxes clutter results without this step.
- File paths: Incorrect paths to model or image files cause failures.
```python
import cv2

# Wrong way: passing the raw image without creating a blob
image = cv2.imread('image.jpg')
net = cv2.dnn.readNetFromDarknet('yolov3.cfg', 'yolov3.weights')
net.setInput(image)      # Incorrect: should create a blob first
outputs = net.forward()  # This will cause an error or bad results

# Right way: preprocess into a blob, then run the output layers
blob = cv2.dnn.blobFromImage(image, 1/255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())
```
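To see why skipping non-max suppression clutters results, here is a simplified plain-Python sketch of what `cv2.dnn.NMSBoxes` does (greedy suppression by intersection-over-union), using the same 0.5 confidence and 0.4 IoU thresholds as the example; `iou` and `nms` are our illustrative helpers:

```python
def iou(a, b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, score_thr=0.5, iou_thr=0.4):
    """Greedy NMS: keep the highest-scoring box, drop any later box
    that overlaps a kept box by more than iou_thr."""
    order = sorted((i for i, s in enumerate(scores) if s > score_thr),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate boxes on one object, plus one distinct box:
boxes = [[10, 10, 50, 50], [12, 12, 50, 50], [200, 200, 40, 40]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] -- the near-duplicate box 1 is suppressed
```

Without this step, both boxes 0 and 1 would be drawn on the same object.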
Quick Reference
- Use `cv2.dnn.readNetFromDarknet()` to load YOLO models.
- Prepare input with `cv2.dnn.blobFromImage()`, resizing to 416x416.
- Run detection with `net.forward()` on the output layers.
- Filter detections by confidence > 0.5.
- Apply non-max suppression with `cv2.dnn.NMSBoxes()` to remove overlaps.
- Draw bounding boxes with `cv2.rectangle()` and labels with `cv2.putText()`.
Key Takeaways
- Use pre-trained models like YOLO with OpenCV's DNN module for easy object detection in Python.
- Always preprocess images by creating a blob with the correct size and scale before detection.
- Filter detections by confidence and apply non-max suppression to get clean results.
- Ensure correct paths and compatible model files to avoid errors.
- Drawing bounding boxes and labels helps visualize detected objects clearly.