
DNN-based face detection in Computer Vision - Model Pipeline Trace


Model Pipeline - DNN-based face detection

This pipeline uses a deep neural network (DNN) to find faces in images. It takes an image, processes it stage by stage, and outputs bounding boxes that mark where the faces are.

Data Flow - 6 Stages

1. Input Image

1 image x 480 height x 640 width x 3 channels → Load and resize image to fixed size → 1 image x 300 height x 300 width x 3 channels

A photo of a person with background
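
The resize step can be sketched in plain Python as follows. This is a minimal illustration that assumes nearest-neighbor sampling and an image stored as nested lists; the function name `resize_nearest` is hypothetical, and a real pipeline would normally use an image library with better interpolation.

```python
def resize_nearest(image, out_h, out_w):
    """Resize an H x W x C image (nested lists) using nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    resized = []
    for y in range(out_h):
        # Map each output row back to the closest source row.
        src_y = min(in_h - 1, y * in_h // out_h)
        row = []
        for x in range(out_w):
            src_x = min(in_w - 1, x * in_w // out_w)
            row.append(list(image[src_y][src_x]))
        resized.append(row)
    return resized


# Dummy 480 x 640 x 3 image standing in for the photo (values are arbitrary).
image = [[[(x + y) % 256, x % 256, y % 256] for x in range(640)] for y in range(480)]
resized = resize_nearest(image, 300, 300)
print(len(resized), len(resized[0]), len(resized[0][0]))
```

Resizing 640 x 480 straight to 300 x 300 changes the aspect ratio, so faces come out slightly squashed. The detector sees the same distortion during training, which is why this is usually acceptable.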

↓

2. Preprocessing

1 image x 300 x 300 x 3 → Normalize pixel values to range 0-1 → 1 image x 300 x 300 x 3

Pixel values changed from 0-255 to 0.0-1.0
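
As a small sketch of this normalization, assuming 8-bit pixel values stored in nested lists (the helper name `normalize` is hypothetical):

```python
def normalize(image):
    """Scale 8-bit pixel values (0-255) to floats in the range 0.0-1.0."""
    return [[[value / 255.0 for value in pixel] for pixel in row] for row in image]


# A tiny 1 x 2 x 3 image is enough to show the effect.
image = [[[0, 128, 255], [64, 64, 64]]]
normalized = normalize(image)
print(normalized[0][0])
```

The shape is unchanged; only the value range differs. Some pretrained detectors expect a different scheme, such as mean subtraction, so the preprocessing has to match whatever the model was trained with.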

↓

3. Feature Extraction

1 image x 300 x 300 x 3 → Pass through convolutional layers to get features → 1 image x 38 x 38 x 512

Feature map highlighting edges and textures
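
One way to see where the 38 x 38 spatial size could come from is the arithmetic below. It assumes the backbone halves the resolution three times and rounds up on odd sizes, as VGG-style backbones in SSD-type detectors typically do. The source does not name the backbone, so treat this as an illustration only.

```python
import math


def downsampled_size(size, num_halvings):
    """Apply stride-2 downsampling repeatedly, rounding up on odd sizes."""
    for _ in range(num_halvings):
        size = math.ceil(size / 2)
    return size


# 300 -> 150 -> 75 -> 38
feature_size = downsampled_size(300, 3)
print(feature_size)
```

The 512 in the output shape is the number of channels (learned filters) at that depth. It is set by the network design and is not derived from the input size.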

↓

4. Face Proposal

1 image x 38 x 38 x 512 → Detect possible face regions with bounding boxes → 1 image x 8732 boxes x 4 coordinates

Boxes like [x_min, y_min, x_max, y_max] for face candidates
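
The figure of 8732 boxes matches the default-box count of an SSD-style detector with a 300 x 300 input. The sketch below reproduces that count under the assumption that boxes are predicted on six feature maps, of which the 38 x 38 map is only the first. The grid sizes and boxes-per-cell values are the commonly used SSD300 settings, not something the source states.

```python
# Assumed SSD300-style configuration: (grid size, default boxes per grid cell).
feature_maps = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

# Each grid cell on each feature map proposes a fixed number of boxes.
boxes_per_map = [size * size * per_cell for size, per_cell in feature_maps]
total_boxes = sum(boxes_per_map)

print(boxes_per_map)
print(total_boxes)
```

Under this assumption the 38 x 38 map contributes 5776 of the boxes, and the coarser maps add the rest, which cover larger faces.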

↓

5. Classification & Refinement

1 image x 8732 boxes x 4 → Classify boxes as face or background and adjust box size → 1 image x N boxes x 4 coordinates (N < 8732)

Filtered boxes with confidence scores > 0.5
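
The confidence filter can be sketched as below. The 0.5 threshold comes from the text above; the function name `filter_by_confidence` and the sample boxes and scores are made up for illustration.

```python
def filter_by_confidence(boxes, scores, threshold=0.5):
    """Keep only boxes whose face confidence is strictly above the threshold."""
    kept_boxes, kept_scores = [], []
    for box, score in zip(boxes, scores):
        if score > threshold:
            kept_boxes.append(box)
            kept_scores.append(score)
    return kept_boxes, kept_scores


# Hypothetical candidates in [x_min, y_min, x_max, y_max] format.
boxes = [[10, 10, 110, 110], [20, 20, 120, 120], [200, 50, 260, 120], [0, 0, 30, 30]]
scores = [0.92, 0.81, 0.40, 0.05]

kept_boxes, kept_scores = filter_by_confidence(boxes, scores)
print(kept_boxes)
print(kept_scores)
```

Raising the threshold gives fewer false detections but may miss faint or small faces. Lowering it does the opposite.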

↓

6. Non-Maximum Suppression (NMS)

1 image x N boxes x 4 → Remove overlapping boxes to keep best face boxes → 1 image x M boxes x 4 coordinates (M < N)

Final boxes around faces with little overlap
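
A minimal sketch of greedy NMS follows. It assumes boxes in [x_min, y_min, x_max, y_max] format and an intersection-over-union (IoU) threshold of 0.5, which is a common default rather than a value given in the source. The function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, drop those overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes on one face, plus a separate face.
boxes = [[10, 10, 110, 110], [20, 20, 120, 120], [200, 200, 300, 300]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

Here the first two boxes have an IoU of about 0.68, so the lower-scoring one is suppressed. The third box overlaps nothing and is kept.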

Training Trace - Epoch by Epoch


Epoch 1: ************ (1.2)
Epoch 2: ******** (0.9)
Epoch 3: ****** (0.7)
Epoch 4: **** (0.5)
Epoch 5: ** (0.35)

Epoch	Loss ↓	Accuracy ↑	Observation
1	1.2	0.60	Model starts learning, loss high, accuracy low
2	0.9	0.72	Loss decreases, accuracy improves
3	0.7	0.80	Model learns better face features
4	0.5	0.87	Loss continues to drop, accuracy rises
5	0.35	0.92	Good convergence, model detects faces well

Prediction Trace - 6 Layers

Layer 1: Input Image

Layer 2: Preprocessing

Layer 3: Convolutional Layers

Layer 4: Face Proposal Layer

Layer 5: Classification & Box Refinement

Layer 6: Non-Maximum Suppression

Model Quiz - 3 Questions

Test your understanding

What is the purpose of the Non-Maximum Suppression step?

A. To normalize pixel values between 0 and 1

B. To resize the input image to a fixed size

C. To remove overlapping face boxes and keep the best ones

D. To extract features like edges and textures

Key Insight

This trace shows how a deep neural network processes an image step by step to detect faces. The model learns to extract useful features, proposes many candidate face boxes, and then filters them down to the best face locations. Over the five training epochs shown, loss falls from 1.2 to 0.35 and accuracy rises from 0.60 to 0.92, making face detection more reliable.