Complete the code to load a pre-trained Faster R-CNN model from torchvision.
import torchvision.models.detection as detection
model = detection.[1](pretrained=True)
The function fasterrcnn_resnet50_fpn loads a Faster R-CNN model with a ResNet-50 backbone and Feature Pyramid Network, commonly used for object detection.
Complete the code to put the model in evaluation mode before inference.
model = detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.[1]()
Calling eval() on the model sets it to evaluation mode, disabling dropout and freezing batch-normalization statistics, which is important for consistent inference. Leaving the model in train() mode during inference can cause inconsistent results, and predict() is not a PyTorch model method.
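The effect of eval() can be seen on a toy module containing dropout (a standalone illustration, not part of the detection pipeline):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.5)
x = torch.ones(8)

layer.train()            # training mode: dropout randomly zeroes elements
train_out = layer(x)     # some elements zeroed, the rest scaled by 1/(1-p)

layer.eval()             # evaluation mode: dropout becomes the identity
eval_out = layer(x)

print(torch.equal(eval_out, x))  # True: eval() disables dropout
```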
Fix the error in the code to move the model to the GPU device if available.
import torch

model = detection.fasterrcnn_resnet50_fpn(pretrained=True)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.[1](device)
The to() method moves the model to the specified device, which can be CPU or GPU. Calling cuda() directly fails if CUDA is not available, and device() and move() are not methods of a PyTorch module.
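The same device-agnostic pattern works for any module and for input tensors, which must live on the same device as the model (shown here with a small nn.Linear so the sketch runs quickly):

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# .to() works for both modules (in place for parameters) and tensors.
model = nn.Linear(4, 2).to(device)
x = torch.randn(1, 4).to(device)   # inputs must be on the same device

out = model(x)
print(next(model.parameters()).device)  # cuda:0 if a GPU is available, else cpu
```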
Fill in the blank to perform inference with the model.
import torch
from PIL import Image
from torchvision import transforms

image = Image.open('image.jpg').convert('RGB')
transform = transforms.Compose([
    transforms.ToTensor(),
])
input_tensor = transform(image)

model.eval()
with torch.no_grad():
    outputs = model([1])
The model expects a list of tensors as input (even for a single image), so we pass [input_tensor] to it. Each tensor should be of shape (C, H, W). Passing input_tensor directly instead of a list is incorrect, as is adding a batch dimension with unsqueeze(), which is unnecessary and incorrect for these models.
Fill all three blanks to filter detection results by confidence score and print detected labels.
labels = outputs[0]['labels']
scores = outputs[0]['scores']
threshold = 0.8
filtered_indices = [i for i, score in enumerate(scores) if score [1] threshold]
filtered_labels = [labels[i].item() for i in filtered_indices]
print('Detected labels:', [2])

# Convert label IDs to class names. The 91-entry lookup keeps 'N/A'
# placeholders so that indices line up with the COCO category IDs
# (1-90, with gaps) that the pretrained model returns.
COCO_INSTANCE_CATEGORY_NAMES = [
    '__background__', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
    'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A', 'stop sign',
    'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
    'elephant', 'bear', 'zebra', 'giraffe', 'N/A', 'backpack', 'umbrella', 'N/A', 'N/A',
    'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
    'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
    'bottle', 'N/A', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
    'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog',
    'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'N/A', 'dining table',
    'N/A', 'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
    'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A', 'book',
    'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'
]
filtered_names = [COCO_INSTANCE_CATEGORY_NAMES[label] for label in filtered_labels]
print('Detected class names:', [3])
We filter scores with >= threshold to include confident detections. We print the filtered label IDs first, then convert them to class names and print those.
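The filtering logic can be exercised on toy outputs (the label IDs and scores below are fabricated for illustration, using the COCO category IDs for person, car, dog, and bottle):

```python
# Toy detection outputs: COCO label IDs and made-up confidence scores.
labels = [1, 3, 18, 44]            # person, car, dog, bottle
scores = [0.97, 0.55, 0.91, 0.12]
threshold = 0.8

# Keep only detections at or above the confidence threshold.
filtered_indices = [i for i, score in enumerate(scores) if score >= threshold]
filtered_labels = [labels[i] for i in filtered_indices]

# Map the surviving IDs to human-readable names.
names = {1: 'person', 3: 'car', 18: 'dog', 44: 'bottle'}
print('Detected labels:', filtered_labels)                           # [1, 18]
print('Detected class names:', [names[l] for l in filtered_labels])  # ['person', 'dog']
```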