We create custom object detection datasets to teach computers how to find specific things in images. This helps computers recognize objects that matter to us.
Custom object detection dataset in Computer Vision
Introduction
You want a computer to find your own special objects, like your pet or your car.
You have images from your workplace and want to detect tools or products.
You want to build a smart camera that spots certain items in real time.
You need to train a model to detect objects that are not in public datasets.
You want to improve safety by detecting hazards in images from your environment.
Syntax
Dataset structure:
- Images folder: contains all images (e.g., image1.jpg, image2.jpg)
- Annotations folder: contains label files (e.g., image1.txt, image2.txt)

Annotation format (YOLO style example):
<class_id> <x_center> <y_center> <width> <height>

All values are normalized between 0 and 1 relative to image size.
Annotations must match image filenames exactly except for extension.
Coordinates are usually normalized so the dataset stays usable across different image sizes.
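As a sketch of how normalization works in practice, a small helper (the name `to_yolo_line` is illustrative, not part of any library) can convert a pixel-space box into a YOLO-style label line:

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_width, img_height):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_center = (x_min + x_max) / 2 / img_width   # box center, normalized to [0, 1]
    y_center = (y_min + y_max) / 2 / img_height
    width = (x_max - x_min) / img_width          # box size, normalized to [0, 1]
    height = (y_max - y_min) / img_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 200x300-pixel box centered at (500, 500) in a 1000x1000 image:
print(to_yolo_line(0, 400, 350, 600, 650, 1000, 1000))
# -> 0 0.500000 0.500000 0.200000 0.300000
```

This produces exactly the kind of line shown in the annotation example above.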
Examples
Annotations for image1.txt:
0 0.5 0.5 0.2 0.3
1 0.7 0.8 0.1 0.1

This means image1.jpg has two objects: class 0 centered at (50%, 50%) with width 20% and height 30%, and class 1 centered at (70%, 80%) with width 10% and height 10%.
Images and labels are stored separately but linked by filename.
Folder structure:
/images/image1.jpg
/images/image2.jpg
/labels/image1.txt
/labels/image2.txt
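Since images and labels are linked only by filename, it helps to verify the pairing before training. A minimal sketch (the function name `check_pairs` is hypothetical) lists images missing a label file and label files with no matching image:

```python
import os

def check_pairs(image_dir, label_dir):
    """Return (images without labels, labels without images), by filename stem."""
    def stems(directory, extensions):
        return {os.path.splitext(f)[0] for f in os.listdir(directory)
                if f.lower().endswith(extensions)}

    image_stems = stems(image_dir, ('.jpg', '.png'))
    label_stems = stems(label_dir, ('.txt',))
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

Running this over /images and /labels before training catches silent mismatches like image2.jpg with no image2.txt.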
Sample Model
This code loads images and their labels from folders. It reads bounding boxes in normalized form and converts them to pixel values. Then it prints the objects found in each image.
import os
from PIL import Image

def load_dataset(image_dir, label_dir):
    dataset = []
    for filename in os.listdir(image_dir):
        if filename.endswith('.jpg') or filename.endswith('.png'):
            image_path = os.path.join(image_dir, filename)
            label_path = os.path.join(label_dir, filename.rsplit('.', 1)[0] + '.txt')
            if os.path.exists(label_path):
                with open(label_path, 'r') as f:
                    labels = [line.strip().split() for line in f.readlines()]
                image = Image.open(image_path)
                width, height = image.size
                objects = []
                for label in labels:
                    class_id, x_c, y_c, w, h = label
                    x_c, y_c, w, h = map(float, (x_c, y_c, w, h))
                    # Convert normalized to pixel coordinates
                    x_center = x_c * width
                    y_center = y_c * height
                    box_width = w * width
                    box_height = h * height
                    objects.append({
                        'class_id': int(class_id),
                        'bbox': [x_center, y_center, box_width, box_height]
                    })
                dataset.append({'image': filename, 'objects': objects})
    return dataset

# Example usage
image_dir = 'images'
label_dir = 'labels'
dataset = load_dataset(image_dir, label_dir)
for data in dataset:
    print(f"Image: {data['image']}")
    for obj in data['objects']:
        print(f"  Class {obj['class_id']} at {obj['bbox']}")
Important Notes
Make sure your annotation format matches the model you plan to use.
Keep your dataset balanced with enough examples for each class.
Use tools like labelImg or makesense.ai to create annotations easily.
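To check the class balance mentioned above, one option is to count boxes per class across all label files. A minimal sketch (the name `class_distribution` is illustrative):

```python
import os
from collections import Counter

def class_distribution(label_dir):
    """Count how many bounding boxes each class has across all label files."""
    counts = Counter()
    for filename in os.listdir(label_dir):
        if filename.endswith('.txt'):
            with open(os.path.join(label_dir, filename)) as f:
                for line in f:
                    parts = line.split()
                    if parts:  # skip blank lines
                        counts[int(parts[0])] += 1
    return counts
```

If one class dominates the counts, consider collecting or annotating more examples of the underrepresented classes before training.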
Summary
Custom datasets help computers learn to find your special objects.
Annotations link images to object locations using normalized coordinates.
Organize images and labels carefully for smooth training.