
FCN (Fully Convolutional Network) in Computer Vision - Model Pipeline Trace


Model Pipeline - FCN (Fully Convolutional Network)

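A Fully Convolutional Network (FCN) is a neural network built from convolutional, pooling, and upsampling layers, with no fully connected layers, so it can output a class label for every pixel of the input image. This task is called semantic segmentation: the network marks each pixel with the object it belongs to, for example 'car' or 'road'.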

Data Flow - 7 Stages

1. Input Image

1 image x 256 height x 256 width x 3 channels → Raw image loaded for processing → 1 image x 256 height x 256 width x 3 channels

A 256x256 color photo with red, green, blue channels

↓

2. Convolutional Layers

1 x 256 x 256 x 3 → Apply filters to detect edges and shapes → 1 x 256 x 256 x 64

Feature maps highlighting edges and textures
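
To make "apply filters to detect edges" concrete, here is a minimal sketch of one filter sliding over a single-channel patch. It assumes a 3x3 kernel with zero padding so the output keeps the input size, matching the unchanged 256 x 256 spatial size in this stage. The names `conv2d_same`, `patch`, and `vertical_edge` are illustrative, and the filter is hand-picked, whereas a trained FCN learns its 64 filters from data.

```python
def conv2d_same(image, kernel):
    """Slide a square kernel over a 2D grid, zero-padding the border
    so the output has the same height and width as the input."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - pad, x + dx - pad
                    # Positions outside the image contribute zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx] * kernel[dy][dx]
            out[y][x] = total
    return out

# A 4x4 grayscale patch: dark left half, bright right half.
patch = [[0, 0, 1, 1] for _ in range(4)]
# Hand-picked vertical-edge filter (an assumption for illustration).
vertical_edge = [[-1, 0, 1] for _ in range(3)]

response = conv2d_same(patch, vertical_edge)
for row in response:
    print(row)
```

The largest positive responses fall on the two columns that straddle the dark-to-bright boundary. The negative values in the last column come from the zero padding, which looks like a bright-to-dark edge at the border.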

↓

3. Pooling Layers

1 x 256 x 256 x 64 → Halve the feature-map height and width to summarize local regions → 1 x 128 x 128 x 64

Smaller feature maps summarizing regions
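
The halving from 256 to 128 can be sketched as follows, assuming 2x2 max pooling with stride 2. The source does not name the pooling type, so max pooling is an assumption here, and `max_pool_2x2` is a hypothetical helper that works on one channel.

```python
def max_pool_2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]

fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(fmap)
print(pooled)  # [[4, 2], [2, 8]]
```

Each output value summarizes a 2x2 region, so height and width are halved while the number of channels stays the same.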

↓

4. More Convolutional Layers

1 x 128 x 128 x 64 → Extract deeper features → 1 x 128 x 128 x 128

Feature maps capturing complex shapes

↓

5. Upsampling Layers

1 x 128 x 128 x 128 → Increase spatial size to original image size → 1 x 256 x 256 x 128

Upscaled feature maps aligned with input size
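
As a rough illustration of restoring spatial size, the sketch below doubles height and width by nearest-neighbor repetition. This is only one possible method and an assumption here: the source does not say which upsampling is used, and FCN implementations often use bilinear interpolation or learned transposed convolutions instead. `upsample_nearest_2x` is a hypothetical helper for one channel.

```python
def upsample_nearest_2x(fmap):
    """Repeat every value twice along both axes (nearest-neighbor upsampling)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # double the width
        out.append(wide)
        out.append(list(wide))                     # double the height
    return out

small = [[4, 2], [2, 8]]
big = upsample_nearest_2x(small)
for row in big:
    print(row)
```

The output has twice the height and width of the input, which is how a 128 x 128 feature map returns to 256 x 256.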

↓

6. Final Convolution

1 x 256 x 256 x 128 → Produce pixel-wise class scores → 1 x 256 x 256 x 3

Scores for 3 classes per pixel (e.g., background, object1, object2)
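
One common way to turn feature channels into class scores is a 1x1 convolution, which applies the same small linear classifier at every pixel. The sketch below assumes that design (the source only says "final convolution") and uses 2 input channels instead of 128 to stay readable. `conv1x1`, `weights`, and `biases` are illustrative names with made-up values.

```python
def conv1x1(features, weights, biases):
    """Apply a per-pixel linear map.

    features: H x W x C_in nested lists
    weights:  C_out x C_in
    biases:   C_out
    returns:  H x W x C_out class scores
    """
    return [
        [
            [sum(w * f for w, f in zip(w_row, pixel)) + b
             for w_row, b in zip(weights, biases)]
            for pixel in row
        ]
        for row in features
    ]

features = [[[1.0, 0.0], [0.0, 2.0]]]            # 1 x 2 pixels, 2 channels
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 classes x 2 channels
biases = [0.0, 0.0, -1.0]

scores = conv1x1(features, weights, biases)
print(scores)
```

The spatial size is untouched; only the channel dimension changes, from the number of features to the number of classes.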

↓

7. Softmax Activation

1 x 256 x 256 x 3 → Convert scores to probabilities per pixel → 1 x 256 x 256 x 3

Probabilities summing to 1 for each pixel
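
The per-pixel softmax can be sketched in a few lines. The helper names `softmax` and `softmax_per_pixel` and the example scores are illustrative; the subtraction of the maximum is a standard trick to avoid overflow and does not change the result.

```python
import math

def softmax(scores):
    """Turn a list of class scores into probabilities that sum to 1."""
    m = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_per_pixel(score_map):
    """Apply softmax independently to the class scores of every pixel."""
    return [[softmax(pixel) for pixel in row] for row in score_map]

# 1 x 2 pixels, 3 class scores each (made-up values).
score_map = [[[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]]
probs = softmax_per_pixel(score_map)

# The predicted label of a pixel is the class with the highest probability.
labels = [[max(range(len(p)), key=p.__getitem__) for p in row] for row in probs]
print(probs)
print(labels)
```

Taking the highest-probability class at each pixel yields the final segmentation map.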

Training Trace - Epoch by Epoch

[Chart: loss per epoch for epochs 1 to 5; the values are listed in the table below.]

Epoch	Loss ↓	Accuracy ↑	Observation
1	1.2	0.45	Model starts learning basic features
2	0.9	0.60	Improved pixel classification
3	0.7	0.72	Model captures more details
4	0.55	0.80	Better segmentation boundaries
5	0.45	0.85	Model converging well
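
The two columns in the table can be computed as in the sketch below. It assumes that "Loss" is the mean per-pixel cross-entropy and that "Accuracy" is the fraction of correctly labeled pixels; the source does not state either definition, and the probabilities and labels here are made up for a 2 x 2 image.

```python
import math

def pixel_accuracy(pred_labels, true_labels):
    """Fraction of pixels whose predicted class matches the ground truth."""
    pairs = [
        (p, t)
        for pred_row, true_row in zip(pred_labels, true_labels)
        for p, t in zip(pred_row, true_row)
    ]
    return sum(p == t for p, t in pairs) / len(pairs)

def mean_cross_entropy(probs, true_labels):
    """Average of -log(probability assigned to the true class) over all pixels."""
    losses = [
        -math.log(pixel_probs[t])
        for prob_row, label_row in zip(probs, true_labels)
        for pixel_probs, t in zip(prob_row, label_row)
    ]
    return sum(losses) / len(losses)

# Softmax outputs for a 2 x 2 image with 3 classes (made-up values).
probs = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
]
true_labels = [[0, 1], [2, 1]]
pred_labels = [[max(range(3), key=p.__getitem__) for p in row] for row in probs]

print("accuracy:", pixel_accuracy(pred_labels, true_labels))
print("loss:", mean_cross_entropy(probs, true_labels))
```

Three of the four pixels are labeled correctly, so the accuracy is 0.75. The loss shrinks as the probability assigned to the true class approaches 1, which is why loss and accuracy move in opposite directions during training.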

Prediction Trace - 7 Layers

Layer 1: Input Image

Layer 2: Convolutional Layer

Layer 3: Pooling Layer

Layer 4: More Convolutional Layers

Layer 5: Upsampling Layer

Layer 6: Final Convolution

Layer 7: Softmax Activation
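
The seven layers above can be followed as a shape trace. The sketch below performs no real computation; it only reproduces the tensor shapes listed in the data flow, with a hypothetical function `trace_fcn_shapes` and the channel counts (64, 128) and class count (3) taken from that listing.

```python
def trace_fcn_shapes(batch=1, height=256, width=256, channels=3, num_classes=3):
    """Return (stage name, output shape) pairs in batch x height x width x channels order."""
    n, h, w = batch, height, width
    trace = [("Input Image", (n, h, w, channels))]
    trace.append(("Convolutional Layers", (n, h, w, 64)))        # channels 3 -> 64
    h, w = h // 2, w // 2                                        # pooling halves H and W
    trace.append(("Pooling Layers", (n, h, w, 64)))
    trace.append(("More Convolutional Layers", (n, h, w, 128)))  # channels 64 -> 128
    h, w = h * 2, w * 2                                          # upsampling doubles H and W
    trace.append(("Upsampling Layers", (n, h, w, 128)))
    trace.append(("Final Convolution", (n, h, w, num_classes)))  # channels 128 -> classes
    trace.append(("Softmax Activation", (n, h, w, num_classes))) # shape unchanged
    return trace

for name, shape in trace_fcn_shapes():
    print(f"{name:26s} {shape}")
```

The output spatial size equals the input spatial size, which is what allows one prediction per input pixel.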

Model Quiz - 3 Questions

Test your understanding

What is the main purpose of the upsampling layers in an FCN?

A. To apply activation functions like ReLU

B. To reduce the number of channels in the feature maps

C. To increase the spatial size of feature maps back to the input image size

D. To split the image into smaller patches

Key Insight

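Fully Convolutional Networks label each pixel by combining convolution, pooling, and upsampling layers: convolution and pooling extract features at reduced resolution, and upsampling restores the input resolution so that every pixel receives a class. In the training trace above, loss falls from 1.2 to 0.45 and accuracy rises from 0.45 to 0.85 over five epochs, which shows the model labeling more pixels correctly as training proceeds.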