What is the main advantage of using a pre-trained object detection model instead of training one from scratch?
Think about the resources needed to train a model from zero versus using a model already trained on a large dataset.
Pre-trained models have already learned useful features from large datasets, so they need less data and time to adapt to new tasks.
What is the output of this Python code snippet using a pre-trained detection model from torchvision?
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

# Dummy input: batch of 1 image with 3 channels, 224x224 pixels
input_tensor = torch.randn(1, 3, 224, 224)
outputs = model(input_tensor)
print(type(outputs))
Check the documentation for torchvision detection models' output format.
In eval mode, pre-trained detection models in torchvision return a list of dictionaries, one per image in the batch, so the snippet prints <class 'list'>. Each dictionary contains 'boxes', 'labels', and 'scores' tensors.
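A minimal sketch of parsing that per-image dictionary (the values below are made up, and plain lists stand in for the real torch.Tensors, so it runs without the model):

```python
# Mock of one entry in the list a torchvision detection model returns.
# Real values are torch.Tensors; plain lists are used here for illustration.
prediction = {
    "boxes": [[10.0, 20.0, 110.0, 220.0], [30.0, 40.0, 90.0, 100.0]],
    "labels": [1, 18],        # category ids
    "scores": [0.98, 0.42],   # confidence per detection
}

rows = [
    f"label={label} score={score:.2f} box={box}"
    for box, label, score in zip(
        prediction["boxes"], prediction["labels"], prediction["scores"]
    )
]
print("\n".join(rows))
```

Each box is in (x1, y1, x2, y2) pixel coordinates, and scores are sorted in decreasing order in the real output.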
You want to deploy an object detection model on a mobile device with limited computing power and need real-time performance. Which pre-trained model is the best choice?
Consider model size and speed for mobile deployment.
YOLOv5 Nano is designed for fast inference on low-resource devices, unlike heavier models like Faster R-CNN or Mask R-CNN.
When using a pre-trained detection model, what is the effect of increasing the confidence threshold parameter during inference?
Think about filtering detections based on how sure the model is.
Increasing the confidence threshold filters out low-confidence detections, reducing false positives but possibly missing some true objects.
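The trade-off can be sketched in plain Python (the detections and scores below are made up; real code would threshold the 'scores' tensor the model returns):

```python
def filter_by_confidence(detections, threshold):
    """Keep only detections whose score meets the threshold."""
    return [d for d in detections if d["score"] >= threshold]

detections = [
    {"label": "dog", "score": 0.92},
    {"label": "cat", "score": 0.55},
    {"label": "dog", "score": 0.30},  # likely a false positive
]

print(len(filter_by_confidence(detections, 0.5)))  # 2
print(len(filter_by_confidence(detections, 0.9)))  # 1
```

Raising the threshold from 0.5 to 0.9 drops the low-confidence detection (fewer false positives), but a true object detected at 0.55 would also be lost (lower recall).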
You evaluate a pre-trained object detection model on a test set and get these results: precision = 0.75, recall = 0.60. What does this tell you about the model's detections?
Recall measures how many true objects are found; precision measures how many detections are correct.
Precision of 0.75 means 75% of the model's detections are correct (25% are false positives); recall of 0.60 means it finds 60% of the true objects, so 40% are missed.
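The arithmetic behind those metrics, with hypothetical counts chosen to reproduce the stated values:

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# 15 correct detections, 5 false positives, 10 missed objects
p, r = precision_recall(tp=15, fp=5, fn=10)
print(p, r)  # 0.75 0.6
```

Here precision = 15 / (15 + 5) = 0.75 and recall = 15 / (15 + 10) = 0.60, matching the evaluation results in the question.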