You want to detect very small objects in images using torchvision detection models. Which model is generally best suited for this task?
Think about models that use multi-scale features to detect objects of different sizes.
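Faster R-CNN with FPN builds a pyramid of feature maps at several resolutions, and its higher-resolution levels help detect small objects better than options without an FPN, such as SSD or RetinaNet without FPN. Mask R-CNN without a backbone is not a valid configuration, since the detector needs a backbone to extract features.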
Given a batch of 2 images passed through a pretrained Faster R-CNN model from torchvision, what is the type and length of the output?
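import torch
import torchvision

# Load a pretrained Faster R-CNN (torchvision >= 0.13 prefers weights="DEFAULT" over pretrained=True)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()  # inference mode: the model returns detections, not losses

# A batch is a list of 3D tensors; the images may differ in size
inputs = [torch.randn(3, 300, 400), torch.randn(3, 500, 600)]
outputs = model(inputs)
print(type(outputs), len(outputs))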
Check the model's documentation for output format when passing a list of images.
In evaluation mode, torchvision detection models return a list of dictionaries, one per input image, each holding that image's boxes, labels, and scores. For the two-image batch above, the output is therefore a list of length 2.
In torchvision's Faster R-CNN, what is the effect of increasing the IoU threshold used in the Non-Maximum Suppression (NMS) step during inference?
Think about what happens when the threshold for overlap to suppress boxes is higher.
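NMS suppresses a lower-scoring box only when its IoU with a higher-scoring kept box exceeds the threshold. Increasing the IoU threshold means boxes must overlap more to be suppressed, so more boxes remain, which can leave duplicate detections of the same object.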
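The effect can be checked with a small pure-Python stand-in for greedy NMS. This is a sketch for illustration, not torchvision's implementation (the library exposes `torchvision.ops.nms`, and Faster R-CNN takes the final threshold through its `box_nms_thresh` constructor argument). The boxes and scores below are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: visit boxes by descending score and drop any box whose
    IoU with an already kept box is greater than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Boxes 0 and 1 cover nearly the same region (IoU about 0.68); box 2 is far away.
boxes = [(0, 0, 100, 100), (10, 10, 110, 110), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]

print(nms(boxes, scores, iou_threshold=0.5))  # [0, 2]: the near-duplicate is suppressed
print(nms(boxes, scores, iou_threshold=0.7))  # [0, 1, 2]: the near-duplicate survives
```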
You get the error: 'TypeError: forward() missing 1 required positional argument: "targets"' when training Mask R-CNN from torchvision. What is the most likely cause?
Check the model's forward method signature for training mode.
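In training mode (after model.train()), torchvision detection models require both images and targets, called as model(images, targets), and return a dictionary of losses. Calling the model with images only, as in inference, leaves out the targets and causes this error.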
You evaluate a torchvision Faster R-CNN model on a test set and get a mean Average Precision (mAP) of 0.75 at IoU=0.5. What does this number mean?
Recall what mAP and IoU thresholds represent in object detection evaluation.
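At IoU=0.5, a detection counts as a true positive only if it overlaps a ground-truth box of the same class with an IoU of at least 0.5. Average precision (AP) summarizes the precision-recall curve for one class as detections are ranked by score, and mAP is the mean of AP over all classes. A value of 0.75 indicates good detection quality under this fairly lenient overlap criterion; it does not mean that 75% of the objects were detected.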
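The metric can be sketched in a few lines of pure Python. This is a simplified, single-image illustration under stated assumptions: greedy matching of each detection to the best unmatched ground-truth box, and AP as the area under the interpolated precision-recall curve. Benchmark toolkits such as the COCO evaluator differ in details, and all boxes, scores, and class names below are made up.

```python
def box_iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def average_precision(detections, gt_boxes, iou_threshold=0.5):
    """AP for one class. detections is a list of (score, box);
    gt_boxes must be non-empty."""
    matched = set()
    tp_flags = []
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Find the best still-unmatched ground-truth box for this detection
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gt_boxes):
            if j not in matched and box_iou(box, gt) > best_iou:
                best_iou, best_j = box_iou(box, gt), j
        is_tp = best_j is not None and best_iou >= iou_threshold
        if is_tp:
            matched.add(best_j)
        tp_flags.append(is_tp)

    # Precision and recall after each ranked detection
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / rank)
        recalls.append(tp / len(gt_boxes))

    # Area under the interpolated (monotonically decreasing) precision curve
    ap, prev_recall = 0.0, 0.0
    for k, recall in enumerate(recalls):
        ap += (recall - prev_recall) * max(precisions[k:])
        prev_recall = recall
    return ap


# Hypothetical class "cat": two objects, three detections (one false positive)
cat_gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
cat_dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (21, 21, 31, 31))]
# Hypothetical class "dog": one object, one perfect detection
dog_gt = [(5, 5, 15, 15)]
dog_dets = [(0.95, (5, 5, 15, 15))]

ap_per_class = [average_precision(cat_dets, cat_gt), average_precision(dog_dets, dog_gt)]
mean_ap = sum(ap_per_class) / len(ap_per_class)
print(ap_per_class, mean_ap)
```

Raising `iou_threshold` to 0.75 turns the third "cat" detection (IoU about 0.68) into a false positive and lowers that class's AP, which is why the IoU threshold is always reported alongside the mAP value.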