CNNs (convolutional neural networks) are well suited to finding patterns in images. They examine small regions of an image and combine what they learn from each region to interpret the whole picture.
Why CNNs dominate image classification in Computer Vision
Introduction
CNNs are a common choice in these situations:
When you want a computer to recognize objects in photos, such as cats or cars.
When sorting pictures into groups, such as different types of flowers.
When detecting faces or expressions in images for apps.
When improving photo search by understanding image content.
When building apps that need to read handwritten numbers or letters.
Syntax
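import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 input channels (RGB), 16 filters of size 3x3
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
        # 2x2 max pooling with stride 2 halves the height and width
        self.pool = nn.MaxPool2d(2, 2)
        # 16 * 6 * 6 implies 14x14 inputs: 14 -> 12 after conv, 12 -> 6 after pooling
        self.fc1 = nn.Linear(16 * 6 * 6, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # convolution, ReLU, then pooling
        x = torch.flatten(x, 1)  # flatten everything except the batch dimension
        x = self.fc1(x)  # 10 class scores per image
        return x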
CNNs use layers called convolutional layers to scan images piece by piece.
Pooling layers help reduce image size while keeping important info.
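The input size of the fully connected layer has to match whatever comes out of the convolution and pooling steps. The following is a minimal sketch of that arithmetic, assuming square inputs and PyTorch's defaults for nn.Conv2d (stride 1, no padding). The helper name out_size is hypothetical and not part of PyTorch, and the 14x14 input size is an assumption inferred from the 16 * 6 * 6 in the model above.

```python
def out_size(n, kernel_size, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (hypothetical helper)."""
    return (n + 2 * padding - kernel_size) // stride + 1


# Assumed 14x14 input: the size for which 16 * 6 * 6 works out.
after_conv = out_size(14, kernel_size=3)                     # 14 -> 12
after_pool = out_size(after_conv, kernel_size=2, stride=2)   # 12 -> 6
fc_in_features = 16 * after_pool * after_pool                # 16 channels * 6 * 6

print(after_conv, after_pool, fc_in_features)
```

Feeding the model an image of a different size changes these numbers, and the first argument of nn.Linear would then need to change with them.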
Examples
This creates a convolutional layer that takes grayscale (single-channel) images and applies 8 filters of size 3x3.
conv_layer = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3)
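To make "scanning an image piece by piece" concrete, here is a rough pure-Python sketch of what a single 3x3 filter computes. It is an illustration only: nn.Conv2d performs the same sliding-window sum (a cross-correlation) but also adds a bias, handles several channels, and learns its weights during training. The function name correlate2d, the toy image, and the hand-picked edge filter are all assumptions made for this example.

```python
def correlate2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


# Toy 4x5 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 0, 1, 1] for _ in range(4)]
# Hand-picked vertical-edge filter; a trained layer learns its own weights.
kernel = [[-1, 0, 1] for _ in range(3)]

response = correlate2d(image, kernel)
print(response)  # [[0, 3, 3], [0, 3, 3]]
```

The response is zero where the window covers a flat region and positive where it overlaps the dark-to-bright edge, which is the sense in which a filter "finds" a pattern.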
This layer shrinks the image by taking the biggest value in each 2x2 area.
pool_layer = nn.MaxPool2d(kernel_size=2, stride=2)
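The effect of 2x2 max pooling can be checked by hand on a small grid. The sketch below is a plain-Python illustration, not how PyTorch implements it; the function name max_pool_2x2 and the sample values are made up for this example.

```python
def max_pool_2x2(image):
    """Keep the largest value in each non-overlapping 2x2 block."""
    pooled = []
    for i in range(0, len(image) - 1, 2):
        row = []
        for j in range(0, len(image[0]) - 1, 2):
            block = (image[i][j], image[i][j + 1],
                     image[i + 1][j], image[i + 1][j + 1])
            row.append(max(block))
        pooled.append(row)
    return pooled


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 4, 6],
]
print(max_pool_2x2(image))  # [[4, 5], [3, 7]]
```

The 4x4 grid becomes 2x2, so height and width are halved while the strongest value in each block is kept.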
This is a fully connected layer that outputs 10 class scores.
fc_layer = nn.Linear(in_features=128, out_features=10)
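To see how the three layers above fit together, the sketch below chains them on a dummy input and prints the shape after each step. The 10x10 input size is an assumption, chosen so that the flattened feature map has exactly 128 values (8 channels * 4 * 4) and matches in_features=128; the examples themselves do not state an input size.

```python
import torch
import torch.nn as nn

conv_layer = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3)
pool_layer = nn.MaxPool2d(kernel_size=2, stride=2)
fc_layer = nn.Linear(in_features=128, out_features=10)

# Assumed input: a batch of one 10x10 grayscale image.
x = torch.randn(1, 1, 10, 10)

features = conv_layer(x)          # shape (1, 8, 8, 8): 10 - 3 + 1 = 8
pooled = pool_layer(features)     # shape (1, 8, 4, 4): 8 / 2 = 4
flat = torch.flatten(pooled, 1)   # shape (1, 128): 8 * 4 * 4 = 128
scores = fc_layer(flat)           # shape (1, 10): one score per class

print(features.shape, pooled.shape, flat.shape, scores.shape)
```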
Sample Model
This code builds a simple CNN that looks at 28x28 grayscale images and outputs scores for 2 classes. It shows how CNN layers work together.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 4, 3)  # 1 input channel (grayscale), 4 filters
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(4 * 13 * 13, 2)  # assuming input 28x28

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        return x


# Create a dummy batch of 2 grayscale images 28x28
inputs = torch.randn(2, 1, 28, 28)
model = SimpleCNN()
outputs = model(inputs)
print("Output shape:", outputs.shape)
print("Output values:", outputs)
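Running this prints Output shape: torch.Size([2, 2]), followed by the four score values. The values differ on every run because the inputs and the initial weights are random.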
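The model's outputs are raw class scores, not probabilities. As a minimal sketch of how such scores are usually read, the snippet below applies softmax and argmax to a hard-coded tensor. The score values are hypothetical and stand in for the output of the model above.

```python
import torch

# Hypothetical scores for a batch of 2 images and 2 classes.
scores = torch.tensor([[0.2, 1.5],
                       [2.0, -1.0]])

probs = torch.softmax(scores, dim=1)   # each row now sums to 1
predicted = scores.argmax(dim=1)       # index of the highest score per image

print(probs)
print(predicted)  # tensor([1, 0])
```

With an untrained model these predictions carry no meaning; they only become useful after the weights have been fitted to labeled images.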
Important Notes
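CNNs learn useful features, such as edges and textures, directly from the training images, so features do not have to be designed by hand.
Because each filter's weights are shared across every position in the image, a convolutional layer needs far fewer parameters than a fully connected layer on the same input, which makes training easier.
Pooling layers make CNNs less sensitive to small shifts in an image, because a feature that moves slightly within a pooling window produces the same output.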
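The parameter savings can be checked with simple arithmetic. The sketch below counts the weights of a convolutional layer like nn.Conv2d(1, 4, 3) from the sample model and compares them with a hypothetical fully connected layer that maps a flattened 28x28 image to the same number of outputs. The fully connected layer is an assumption introduced only for the comparison.

```python
# Convolutional layer as in nn.Conv2d(1, 4, 3): one 3x3 kernel per
# (input channel, filter) pair, plus one bias per filter.
in_channels, filters, k = 1, 4, 3
conv_params = filters * in_channels * k * k + filters

# Hypothetical fully connected layer producing the same number of outputs
# from a flattened 28x28 image (4 feature maps of size 26x26).
inputs = 28 * 28
outputs = filters * 26 * 26
dense_params = inputs * outputs + outputs

print(conv_params)   # 40
print(dense_params)  # 2122640
```

The convolutional layer reuses the same 40 numbers at every image position, while the dense layer needs a separate weight for every input-output pair.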
Summary
CNNs scan images in small parts to find patterns.
Pooling helps shrink images while keeping key info.
Together, these properties make CNNs effective and widely used for image tasks.