
Custom object detection dataset in Computer Vision - Model Pipeline Trace

Model Pipeline - Custom object detection dataset

This pipeline traces how a custom object detection dataset is prepared and used to train a model, and how the trained model then predicts objects in new images.

Data Flow - 5 Stages
Stage 1: Raw dataset
Input: 1000 images of varying sizes
Operation: Collect images and bounding box annotations with labels
Output: 1000 images with bounding boxes and labels
Example: Image1: cat at (50,30,150,130), dog at (200,100,300,250)
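One way to hold such annotations in memory is a small record per box. The sketch below is illustrative only: the `BoxAnnotation` class and its field names are hypothetical, the coordinates are assumed to be (x_min, y_min, x_max, y_max) in pixels, and the 560x560 image size is an assumption chosen to be consistent with the later resize example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxAnnotation:
    """One labelled box, assumed to be (x_min, y_min, x_max, y_max) in pixels."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def is_valid(self, image_width, image_height):
        """True when the box has positive area and lies inside the image."""
        return (0 <= self.x_min < self.x_max <= image_width
                and 0 <= self.y_min < self.y_max <= image_height)


# Image1 from the example; the 560x560 image size is an assumption.
image1 = [
    BoxAnnotation("cat", 50, 30, 150, 130),
    BoxAnnotation("dog", 200, 100, 300, 250),
]
all_valid = all(a.is_valid(560, 560) for a in image1)
```

Running a check like `is_valid` at collection time catches swapped corners or out-of-bounds boxes before they reach training.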
Stage 2: Preprocessing
Input: 1000 images with bounding boxes
Operation: Resize images to 224x224, normalize pixel values, adjust bounding boxes accordingly
Output: 1000 images of 224x224 x 3 channels with normalized pixel values and updated bounding boxes
Example: Image1 resized to 224x224, cat box now (20,12,60,52)
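The box adjustment is a per-axis rescale: x coordinates are multiplied by new_width / old_width and y coordinates by new_height / old_height. The sketch below assumes (x_min, y_min, x_max, y_max) pixel boxes and an original Image1 size of 560x560, which is an assumption but is the size that makes the cat box (50,30,150,130) map to (20,12,60,52). The helper name `resize_box` is hypothetical.

```python
def resize_box(box, old_size, new_size):
    """Rescale an (x_min, y_min, x_max, y_max) box from old_size to new_size.

    Sizes are (width, height) tuples in pixels.
    """
    x_min, y_min, x_max, y_max = box
    old_w, old_h = old_size
    new_w, new_h = new_size
    return (
        x_min * new_w / old_w,
        y_min * new_h / old_h,
        x_max * new_w / old_w,
        y_max * new_h / old_h,
    )


# Assumed original size of 560x560, resized to the 224x224 model input.
cat_box = resize_box((50, 30, 150, 130), old_size=(560, 560), new_size=(224, 224))
print(cat_box)  # (20.0, 12.0, 60.0, 52.0)
```

Because the two axes are scaled independently, the same helper also covers non-square originals, where the box's aspect ratio changes along with the image's.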
Stage 3: Data augmentation
Input: 1000 images of 224x224 with bounding boxes
Operation: Apply random flips and color jitter, update bounding boxes for flips
Output: 1000 augmented images of 224x224 with updated bounding boxes
Example: Image1 flipped horizontally, cat box coordinates adjusted accordingly
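For a horizontal flip, each x coordinate is mirrored about the image width and the two x values swap roles so that x_min stays smaller than x_max; y coordinates are unchanged. Color jitter only alters pixel values, so it leaves boxes untouched. A minimal sketch, assuming (x_min, y_min, x_max, y_max) pixel boxes and a hypothetical helper name `hflip_box`:

```python
def hflip_box(box, image_width):
    """Mirror an (x_min, y_min, x_max, y_max) box for a horizontally flipped image."""
    x_min, y_min, x_max, y_max = box
    # The old right edge becomes the new left edge, and vice versa.
    return (image_width - x_max, y_min, image_width - x_min, y_max)


# Cat box from the preprocessing example, in a 224-pixel-wide image.
flipped = hflip_box((20, 12, 60, 52), image_width=224)
print(flipped)  # (164, 12, 204, 52)
```

Applying the flip twice returns the original box, which makes a convenient sanity check for an augmentation pipeline.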
Stage 4: Train/test split
Input: 1000 augmented images with bounding boxes
Operation: Split dataset into 800 training and 200 testing images
Output: 800 training images, 200 testing images with bounding boxes
Example: Training set: 800 images, Testing set: 200 images
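An 80/20 split like this can be sketched by shuffling sample indices with a fixed seed and slicing. The function name `split_indices`, the seed value, and the choice to split by index are assumptions for illustration, not something the pipeline specifies.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) from a seeded shuffle."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(1000)
print(len(train_idx), len(test_idx))  # 800 200
```

Splitting indices rather than the images themselves keeps each image paired with its own boxes and labels.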
Stage 5: Model input preparation
Input: 800 training images of 224x224 with bounding boxes
Operation: Convert bounding boxes and labels into model-specific tensor format
Output: 800 training samples with image tensors and target tensors
Example: Sample: image tensor shape (3,224,224), target dict with boxes tensor shape (N,4), labels tensor shape (N)
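The target dict can be sketched in plain Python before it is handed to a tensor library. Everything named here is an assumption for illustration: the `CLASS_TO_INDEX` mapping, the `build_target` helper, and the dog box (80,40,120,100), which is the raw dog box scaled by the same 0.4 factor as the cat box. Reserving index 0 for background follows a common convention in two-stage detectors and may not match every model.

```python
# Hypothetical class mapping; index 0 is left for the background class.
CLASS_TO_INDEX = {"cat": 1, "dog": 2}


def build_target(annotations):
    """Build a target dict from a list of (label, (x_min, y_min, x_max, y_max))."""
    boxes = [[float(v) for v in box] for _, box in annotations]   # N rows of 4 floats
    labels = [CLASS_TO_INDEX[name] for name, _ in annotations]    # N integer class ids
    return {"boxes": boxes, "labels": labels}


target = build_target([
    ("cat", (20, 12, 60, 52)),
    ("dog", (80, 40, 120, 100)),
])
print(len(target["boxes"]), len(target["boxes"][0]), len(target["labels"]))  # 2 4 2
```

With PyTorch, for example, wrapping these lists with `torch.tensor(target["boxes"], dtype=torch.float32)` and `torch.tensor(target["labels"], dtype=torch.int64)` would give the (N,4) and (N) shapes described in the example, here with N = 2.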
Training Trace - Epoch by Epoch
Figure: training loss (y-axis, 0.0 to 2.5) over epochs 1 to 5 (x-axis).
| Epoch | Loss ↓ | Accuracy ↑ | Observation |
|-------|--------|------------|-------------|
| 1 | 2.5 | 0.15 | High loss and low accuracy as model starts learning |
| 2 | 1.8 | 0.35 | Loss decreases, accuracy improves as model learns object features |
| 3 | 1.2 | 0.55 | Model better detects objects, bounding box predictions improve |
| 4 | 0.9 | 0.70 | Loss continues to decrease, accuracy rises steadily |
| 5 | 0.7 | 0.78 | Model converging, good detection performance |
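The "converging" observation can be read directly from the per-epoch loss values: the drop between consecutive epochs shrinks over time. A small sketch using the five loss values from the trace (the variable names are illustrative):

```python
# Loss values for epochs 1-5, taken from the training trace.
losses = [2.5, 1.8, 1.2, 0.9, 0.7]

# Drop in loss between each pair of consecutive epochs, rounded to 2 decimals.
improvements = [round(prev - cur, 2) for prev, cur in zip(losses, losses[1:])]
print(improvements)  # [0.7, 0.6, 0.3, 0.2]
```

Each step improves less than the one before it, which is the usual sign that training is approaching a plateau.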
Prediction Trace - 6 Layers
Layer 1: Input image preprocessing
Layer 2: Feature extraction (CNN layers)
Layer 3: Region proposal network
Layer 4: Bounding box regression and classification
Layer 5: Non-maximum suppression
Layer 6: Output prediction
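Layer 5 can be illustrated with a plain-Python sketch of greedy non-maximum suppression: boxes are visited in descending score order, and a box is kept only if its intersection over union (IoU) with every already-kept box is at or below a threshold. The function names, the 0.5 threshold, and the sample boxes and scores are assumptions for illustration. Detectors often apply this per class, whereas this sketch ignores class labels.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap a higher-scoring kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of the same object plus one separate detection.
boxes = [(20, 12, 60, 52), (22, 14, 62, 54), (80, 40, 120, 100)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap either and is kept.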
Model Quiz - 3 Questions
Test your understanding
What happens to the bounding boxes during image resizing in preprocessing?
A. They are adjusted to match the new image size
B. They remain the same as in the original image
C. They are removed and recreated later
D. They are converted to grayscale
Key Insight
Keeping bounding boxes consistent with their images through resizing and augmentation, then training step by step, improves detection: in this trace the loss falls from 2.5 to 0.7 and accuracy rises from 0.15 to 0.78 over five epochs.