
Face detection with deep learning in Computer Vision - Model Pipeline Trace

Model Pipeline - Face detection with deep learning

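This pipeline detects faces in images with a deep learning model. Each image is resized and normalized, passed through convolutional layers to extract features, and turned into candidate face boxes that are scored, refined, and filtered for overlap. The model is trained to locate faces and is then used to predict face boxes in new images.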

Data Flow - 6 Stages
1. Input Image
   Input: 1 image x 640 x 480 x 3 channels
   Operation: Load and resize image to fixed size
   Output: 1 image x 224 x 224 x 3 channels
   Example: A photo of a person resized to 224x224 pixels with RGB colors

2. Preprocessing
   Input: 1 image x 224 x 224 x 3 channels
   Operation: Normalize pixel values to range 0-1
   Output: 1 image x 224 x 224 x 3 channels
   Example: Pixel values changed from 0-255 to 0.0-1.0

3. Feature Extraction
   Input: 1 image x 224 x 224 x 3 channels
   Operation: Pass image through convolutional layers to extract features
   Output: 1 image x 14 x 14 x 256 feature maps
   Example: Edges and shapes detected in the image

4. Face Region Proposal
   Input: 1 image x 14 x 14 x 256 feature maps
   Operation: Generate candidate face boxes using region proposal network
   Output: 1 image x 300 candidate boxes x 4 coordinates
   Example: 300 boxes with coordinates like [x_min, y_min, x_max, y_max]

5. Classification and Bounding Box Regression
   Input: 1 image x 300 candidate boxes x features
   Operation: Classify each box as face or background and refine box coordinates
   Output: 1 image x 300 boxes with class scores and refined coordinates
   Example: Box 1: face score 0.95, coordinates refined to better fit face

6. Non-Maximum Suppression (NMS)
   Input: 1 image x 300 boxes with scores
   Operation: Remove overlapping boxes to keep best face detections
   Output: 1 image x 5 final face boxes
   Example: 5 boxes left with highest confidence and no large overlap
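The shape changes in stages 1 to 4 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names and the total stride of 16 are assumptions. The source does not state how much the convolutional layers downsample; 16 is simply the value for which 224 / 16 = 14.

```python
# Sketch only: helper names and the stride of 16 are assumptions,
# chosen so that 224 / 16 = 14 matches the shapes listed in the data flow.

def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0-255) to floats in the range 0.0-1.0."""
    return [value / 255.0 for value in pixels]


def feature_map_size(input_size, total_stride):
    """Spatial size after downsampling by total_stride (exact division only)."""
    if input_size % total_stride != 0:
        raise ValueError("input_size must be divisible by total_stride")
    return input_size // total_stride


def trace_shapes(height=224, width=224, channels=3,
                 total_stride=16, feature_channels=256, num_proposals=300):
    """Return the tensor shape produced by each of the first four stages."""
    feat_h = feature_map_size(height, total_stride)
    feat_w = feature_map_size(width, total_stride)
    return {
        "resized": (1, height, width, channels),
        "normalized": (1, height, width, channels),  # values change, shape does not
        "features": (1, feat_h, feat_w, feature_channels),
        "proposals": (1, num_proposals, 4),  # [x_min, y_min, x_max, y_max] per box
    }


shapes = trace_shapes()
for stage, shape in shapes.items():
    print(stage, shape)
```

Normalization leaves the shape untouched and only rescales values, which is why stages 1 and 2 report the same output dimensions.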
Training Trace - Epoch by Epoch

Epoch  Loss ↓  Accuracy ↑  Observation
1      1.2     0.60        Model starts learning basic face features
2      0.9     0.72        Loss decreases as model improves face detection
3      0.7     0.80        Model learns better bounding box predictions
4      0.5     0.87        Face classification accuracy improves
5      0.4     0.91        Model converges with high accuracy and low loss
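The trace reports a single loss and a single accuracy per epoch without saying how either is computed. As a rough illustration, the sketch below assumes a two-part detection loss of the kind used by Faster R-CNN-style detectors (cross-entropy for face vs. background plus smooth L1 on box coordinates) and a thresholded classification accuracy. The function names, the sample format, and the `box_weight` parameter are all hypothetical.

```python
import math

# Sketch only: the source does not specify its loss or accuracy definition.
# This assumes cross-entropy + smooth L1, a common choice for detectors.

def binary_cross_entropy(face_prob, is_face):
    """Classification term for one box; probabilities are clamped for stability."""
    eps = 1e-7
    p = min(max(face_prob, eps), 1.0 - eps)
    return -math.log(p) if is_face else -math.log(1.0 - p)


def smooth_l1(pred_box, target_box, beta=1.0):
    """Box regression term: quadratic for small errors, linear for large ones."""
    total = 0.0
    for pred, target in zip(pred_box, target_box):
        diff = abs(pred - target)
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total


def detection_loss(samples, box_weight=1.0):
    """samples: list of (face_prob, is_face, pred_box, target_box) tuples."""
    cls_total, box_total, num_faces = 0.0, 0.0, 0
    for face_prob, is_face, pred_box, target_box in samples:
        cls_total += binary_cross_entropy(face_prob, is_face)
        if is_face:  # only boxes matched to a face contribute to regression
            box_total += smooth_l1(pred_box, target_box)
            num_faces += 1
    return cls_total / len(samples) + box_weight * box_total / max(num_faces, 1)


def classification_accuracy(samples, threshold=0.5):
    """Fraction of boxes whose thresholded score matches the label."""
    correct = sum((prob >= threshold) == is_face for prob, is_face, _, _ in samples)
    return correct / len(samples)


# Hypothetical batches standing in for an early and a late epoch.
early = [(0.4, True, [0.0, 0.0, 0.5, 0.5], [0.1, 0.1, 0.9, 0.9]), (0.6, False, None, None)]
late = [(0.95, True, [0.1, 0.1, 0.9, 0.9], [0.1, 0.1, 0.9, 0.9]), (0.05, False, None, None)]
print(detection_loss(early), classification_accuracy(early))
print(detection_loss(late), classification_accuracy(late))
```

Under these assumptions, better box fits and more confident correct scores lower the loss and raise the accuracy together, which matches the direction of the trend in the table.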
Prediction Trace - 6 Layers
Layer 1: Input Image
Layer 2: Preprocessing
Layer 3: Feature Extraction (Conv Layers)
Layer 4: Region Proposal Network
Layer 5: Classification and Box Refinement
Layer 6: Non-Maximum Suppression
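The last layer, Non-Maximum Suppression, is easy to sketch in plain Python. The version below is a minimal greedy NMS, not necessarily what this pipeline uses: the function names, the 0.5 IoU threshold, and the 0.5 score threshold are assumptions, and the four example boxes are made up for illustration.

```python
# Minimal greedy NMS sketch; thresholds and example boxes are assumptions.

def iou(box_a, box_b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5, score_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    # Drop low-confidence boxes, then visit the rest from best to worst.
    order = sorted(
        (i for i, score in enumerate(scores) if score >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not heavily overlap a better box already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [
    [10, 10, 110, 110],   # strong detection of a face
    [12, 12, 112, 112],   # near-duplicate of the first box
    [200, 50, 260, 110],  # a second, separate face
    [0, 0, 5, 5],         # low-confidence background box
]
scores = [0.95, 0.90, 0.80, 0.30]

kept = non_max_suppression(boxes, scores)
print(kept)  # [0, 2]
```

The near-duplicate box is dropped because its IoU with the top-scoring box is about 0.92, and the last box never enters the ranking because its score is below the threshold. The same mechanism is what reduces 300 scored boxes to a handful of final detections.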
Model Quiz - 3 Questions
Test your understanding
What is the purpose of the Non-Maximum Suppression step?
A. To resize the input image to a fixed size
B. To normalize pixel values between 0 and 1
C. To remove overlapping boxes and keep the best face detections
D. To extract features from the image using convolution
Key Insight
This trace shows how a deep learning model learns to detect faces by extracting features, proposing candidate face regions, and refining predictions. Over five epochs of training, loss falls from 1.2 to 0.4 while accuracy rises from 0.60 to 0.91. Non-Maximum Suppression then removes overlapping boxes, reducing 300 scored candidates to 5 final face detections.