
DNN-based face detection in Computer Vision - Model Pipeline Trace

Model Pipeline - DNN-based face detection

This pipeline uses a Deep Neural Network (DNN) to find faces in images. It takes an image, processes it step-by-step, and outputs boxes showing where faces are.

Data Flow - 6 Stages
Stage 1: Input Image
  Input: 1 image x 480 height x 640 width x 3 channels
  Operation: Load and resize image to fixed size
  Output: 1 image x 300 height x 300 width x 3 channels
  Example: A photo of a person with background
Stage 2: Preprocessing
  Input: 1 image x 300 x 300 x 3
  Operation: Normalize pixel values to range 0-1
  Output: 1 image x 300 x 300 x 3
  Example: Pixel values changed from 0-255 to 0.0-1.0
Stage 3: Feature Extraction
  Input: 1 image x 300 x 300 x 3
  Operation: Pass through convolutional layers to get features
  Output: 1 image x 38 x 38 x 512
  Example: Feature map highlighting edges and textures
Stage 4: Face Proposal
  Input: 1 image x 38 x 38 x 512
  Operation: Detect possible face regions with bounding boxes
  Output: 1 image x 8732 boxes x 4 coordinates
  Example: Boxes like [x_min, y_min, x_max, y_max] for face candidates
Stage 5: Classification & Refinement
  Input: 1 image x 8732 boxes x 4
  Operation: Classify boxes as face or background and adjust box size
  Output: 1 image x N boxes x 4 coordinates (N < 8732)
  Example: Filtered boxes with confidence scores > 0.5
Stage 6: Non-Maximum Suppression (NMS)
  Input: 1 image x N boxes x 4
  Operation: Remove overlapping boxes to keep best face boxes
  Output: 1 image x M boxes x 4 coordinates (M ≤ N)
  Example: Final boxes around faces with little overlap
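
The figure of 8732 candidate boxes matches the default box count of the standard SSD300 layout, where predictions come from six feature maps rather than from the 38 x 38 map alone. The source does not name the architecture, so the feature-map sizes and boxes-per-cell values below are an assumption borrowed from that layout, and the names are hypothetical. A minimal sketch of the arithmetic:

```python
# Assumed SSD300-style layout (not stated in the source):
# each entry is (feature map side length, default boxes per cell).
ASSUMED_FEATURE_MAPS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def count_default_boxes(feature_maps):
    """Sum side * side cells times boxes per cell over all feature maps."""
    return sum(side * side * per_cell for side, per_cell in feature_maps)


total_boxes = count_default_boxes(ASSUMED_FEATURE_MAPS)
first_map_boxes = count_default_boxes(ASSUMED_FEATURE_MAPS[:1])

print(total_boxes)      # 8732
print(first_map_boxes)  # 5776
```

Under this assumption the 38 x 38 map contributes 5776 of the boxes, and the remaining 2956 come from the smaller, coarser maps that cover larger faces.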
Training Trace - Epoch by Epoch

Epoch 1: ************ (1.2)
Epoch 2: ******** (0.9)
Epoch 3: ****** (0.7)
Epoch 4: **** (0.5)
Epoch 5: ** (0.35)
Epoch  Loss ↓  Accuracy ↑  Observation
1      1.2     0.60        Model starts learning, loss high, accuracy low
2      0.9     0.72        Loss decreases, accuracy improves
3      0.7     0.80        Model learns better face features
4      0.5     0.87        Loss continues to drop, accuracy rises
5      0.35    0.92        Good convergence, model detects faces well
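
The trend in the table can be checked with a few lines of Python. The values are copied from the table; the helper names are hypothetical and only illustrate the check.

```python
# Values copied from the training table.
losses = [1.2, 0.9, 0.7, 0.5, 0.35]
accuracies = [0.60, 0.72, 0.80, 0.87, 0.92]


def is_strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


def is_strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Relative drop in loss from the first epoch to the last.
loss_reduction = (losses[0] - losses[-1]) / losses[0]

print(is_strictly_decreasing(losses), is_strictly_increasing(accuracies))
print(f"Loss fell by {loss_reduction:.0%} over {len(losses)} epochs")
```

Loss falls at every epoch while accuracy rises, and the total drop in loss is about 71%.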
Prediction Trace - 6 Layers
Layer 1: Input Image
Layer 2: Preprocessing
Layer 3: Convolutional Layers
Layer 4: Face Proposal Layer
Layer 5: Classification & Box Refinement
Layer 6: Non-Maximum Suppression
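
Layers 1 and 2 of the trace can be sketched in NumPy. This is an illustration under assumptions, not the pipeline's actual code: the nearest-neighbour resize stands in for whatever resize routine the real pipeline uses, the function names are hypothetical, and a random array stands in for the photo.

```python
import numpy as np


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an H x W x C array (stand-in for a library resize)."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return image[rows[:, None], cols]


def preprocess(image, size=300):
    """Resize to size x size, scale pixels from 0-255 to 0.0-1.0, add a batch axis."""
    resized = resize_nearest(image, size, size)
    scaled = resized.astype(np.float32) / 255.0
    return scaled[np.newaxis, ...]


# Random 480 x 640 RGB array standing in for the input photo.
rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)

batch = preprocess(photo)
print(batch.shape, batch.dtype)  # (1, 300, 300, 3) float32
```

The output shape matches the "1 image x 300 x 300 x 3" tensor that the convolutional layers receive.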
Model Quiz - 3 Questions
Test your understanding
What is the purpose of the Non-Maximum Suppression step?
A. To normalize pixel values between 0 and 1
B. To resize the input image to a fixed size
C. To remove overlapping face boxes and keep the best ones
D. To extract features like edges and textures
Key Insight
This visualization shows how a deep neural network processes an image step-by-step to detect faces. The model learns to extract useful features, propose many candidate face boxes, and then filter them down to the best face locations. Over the five training epochs shown, loss falls from 1.2 to 0.35 and accuracy rises from 0.60 to 0.92, which makes face detection more reliable.
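
The filtering step can be sketched as a confidence threshold followed by greedy Non-Maximum Suppression. The 0.5 score threshold comes from the pipeline description; the 0.45 IoU threshold, the function names, and the sample boxes are assumptions chosen for illustration only.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one [x_min, y_min, x_max, y_max] box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def filter_and_nms(boxes, scores, score_threshold=0.5, iou_threshold=0.45):
    """Drop low-confidence boxes, then greedily suppress overlapping ones."""
    keep_mask = scores > score_threshold
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    kept = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        kept.append(best)
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]  # keep only weakly overlapping boxes
    return boxes[kept], scores[kept]


# Hypothetical candidates: two boxes on one face, one on a second face,
# and one low-confidence background box.
boxes = np.array([
    [100, 100, 200, 200],
    [105, 105, 205, 205],
    [300, 120, 380, 200],
    [10, 10, 50, 50],
], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7, 0.3], dtype=np.float32)

final_boxes, final_scores = filter_and_nms(boxes, scores)
print(final_boxes)
print(final_scores)
```

In this example the background box is removed by the score threshold, and the second box on the first face is suppressed because its IoU with the higher-scoring box is about 0.82, so two boxes remain.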