
R-CNN family overview in Computer Vision - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding the main difference between R-CNN and Fast R-CNN

What is the key difference between the original R-CNN and Fast R-CNN in how they process images for object detection?

A. R-CNN uses a single CNN for the whole image, while Fast R-CNN uses multiple CNNs for each region proposal.
B. Fast R-CNN does not use region proposals, unlike R-CNN which relies on them.
C. Fast R-CNN processes the entire image with a single CNN forward pass, while R-CNN processes each region proposal separately.
D. R-CNN uses a fully convolutional network, while Fast R-CNN uses a fully connected network.
💡 Hint

Think about how many times the CNN runs on the image in each method.
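To make the hint concrete, here is a toy sketch that only counts how often a backbone is invoked under each strategy. `CountingBackbone`, `crop`, and the proposal boxes are hypothetical stand-ins, not part of any real detector; an actual Fast R-CNN also rescales boxes to feature-map coordinates and applies RoI pooling, which this sketch skips.

```python
class CountingBackbone:
    """Hypothetical stand-in for a CNN: it only counts its own calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, pixels):
        self.calls += 1
        return pixels  # pretend the input is already a feature map


def crop(grid, box):
    """Cut the rectangle (x0, y0, x1, y1) out of a 2D list."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in grid[y0:y1]]


def per_region_passes(image, proposals, backbone):
    # R-CNN style: every cropped proposal goes through the backbone.
    return [backbone(crop(image, box)) for box in proposals]


def shared_pass(image, proposals, backbone):
    # Fast R-CNN style: one pass over the image, regions cut from shared features.
    features = backbone(image)
    return [crop(features, box) for box in proposals]


image = [[0] * 32 for _ in range(32)]
proposals = [(0, 0, 8, 8), (4, 4, 16, 16), (10, 2, 30, 20)]

per_region = CountingBackbone()
per_region_passes(image, proposals, per_region)

shared = CountingBackbone()
shared_pass(image, proposals, shared)

print(per_region.calls, shared.calls)
```

With thousands of proposals per image instead of three, the gap in backbone calls grows accordingly.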

Model Choice
intermediate
Choosing the right R-CNN variant for real-time detection

You want to build a real-time object detection system on a mobile device with limited computing power. Which R-CNN family model is the best choice?

A. Faster R-CNN
B. Fast R-CNN
C. R-CNN
D. Mask R-CNN
💡 Hint

Consider which model introduced a faster region proposal method.

Predict Output
advanced
Output shape of feature map in Faster R-CNN backbone

Given an input image of size 224x224x3 passed through the backbone CNN below, which reduces the spatial dimensions by a factor of 8 (three successive halvings), what is the shape of the output feature map?

import torch
import torch.nn as nn

input_tensor = torch.randn(1, 3, 224, 224)

class SimpleBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 256, kernel_size=3, stride=2, padding=1)
        self.pool = nn.MaxPool2d(2)
    def forward(self, x):
        x = self.conv(x)  # halves size
        x = self.pool(x)  # halves size again
        x = self.pool(x)  # halves size again
        return x

model = SimpleBackbone()
output = model(input_tensor)
print(output.shape)
A. torch.Size([1, 256, 56, 56])
B. torch.Size([1, 256, 7, 7])
C. torch.Size([1, 256, 14, 14])
D. torch.Size([1, 256, 28, 28])
💡 Hint

Each stride 2 or pooling halves the spatial size.
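The halving rule in the hint follows from the standard output-size formula for convolution and pooling layers. The sketch below applies it to a hypothetical 64-pixel input side (deliberately not the 224 from the problem) with the same layer settings: a 3x3 convolution with stride 2 and padding 1, then two 2x2 max pools. The helper name `conv_out` is made up for illustration.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output side length of a conv or pooling layer (no dilation, floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


side = 64  # hypothetical input side length
side = conv_out(side, kernel=3, stride=2, padding=1)  # 3x3 conv, stride 2
side = conv_out(side, kernel=2, stride=2)             # 2x2 max pool
side = conv_out(side, kernel=2, stride=2)             # 2x2 max pool
print(side)
```

Each of the three layers halves the side length, so the overall reduction factor is 2 * 2 * 2 = 8.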

Metrics
advanced
Evaluating Mask R-CNN segmentation output

Mask R-CNN outputs a mask for each detected object. Which metric best measures how well the predicted mask matches the true mask?

A. Intersection over Union (IoU)
B. Accuracy
C. Mean Squared Error (MSE)
D. Precision
💡 Hint

Think about overlap between predicted and true masks.
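One way to quantify that overlap for binary masks is sketched below in plain Python. The two 3x3 masks are invented for illustration, and treating two empty masks as a perfect match is a convention chosen here, not something the problem specifies.

```python
def mask_overlap_ratio(pred, true):
    """Ratio of shared foreground pixels to total foreground pixels of two 0/1 masks."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred, true):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical predicted and ground-truth masks.
pred_mask = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
true_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

score = mask_overlap_ratio(pred_mask, true_mask)
print(round(score, 3))
```

Here the masks share 2 foreground pixels out of 6 covered by either mask, so the ratio is 2/6.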

🔧 Debug
expert
Debugging slow training in Faster R-CNN

You notice that training Faster R-CNN on your dataset is very slow. You suspect the bottleneck is in the region proposal network (RPN). Which of the following changes will most likely speed up training while keeping accuracy largely intact?

A. Increase the number of anchors per location in the RPN.
B. Reduce the input image size to a smaller resolution.
C. Remove the RPN and use external region proposals.
D. Increase the number of training epochs.
💡 Hint

Think about how input size affects computation.
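As a rough back-of-the-envelope sketch, the number of anchors the RPN has to score grows with the area of the feature map. The helper `rpn_anchor_count` is hypothetical, and the stride of 16 and 9 anchors per location are assumed values for illustration; your configuration may differ.

```python
def rpn_anchor_count(height, width, stride=16, anchors_per_location=9):
    """Anchors scored by the RPN for one image, assuming a single feature map."""
    feat_h = height // stride
    feat_w = width // stride
    return feat_h * feat_w * anchors_per_location


full = rpn_anchor_count(800, 800)  # hypothetical full-resolution input
half = rpn_anchor_count(400, 400)  # same image at half the side length
print(full, half, full / half)
```

Halving each side cuts the anchor count by about a factor of four, and the backbone's convolution cost shrinks by a similar ratio. Very small objects may become harder to detect at lower resolution, so the trade-off is worth checking on a validation set.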