Complete the code to import the Mask R-CNN model from the torchvision library.
from torchvision.models.detection import [1]
The function maskrcnn_resnet50_fpn builds a Mask R-CNN model with a ResNet-50 backbone and a Feature Pyramid Network (FPN).
Complete the code to create a Mask R-CNN model pre-trained on the COCO dataset.
model = [1](pretrained=True)
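Using maskrcnn_resnet50_fpn(pretrained=True) loads the Mask R-CNN model with weights trained on the COCO dataset. In torchvision 0.13 and later, the pretrained argument is deprecated in favor of weights, for example weights=MaskRCNN_ResNet50_FPN_Weights.DEFAULT.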
Fix the error in the code to switch the model to evaluation mode.
model.[1]()
Common mistakes: using train() instead of eval(); using fit() or predict().
Calling model.eval() sets the model to evaluation mode, which disables dropout and makes batch normalization use its running statistics instead of updating them.
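The effect of evaluation mode can be illustrated on a single dropout layer, which is much cheaper than a full Mask R-CNN. This is a small sketch, not part of the exercise; the names layer, x, train_out and eval_out are chosen here for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.5)
x = torch.ones(1000)

layer.train()
train_out = layer(x)  # some entries are zeroed, the rest are rescaled

layer.eval()
eval_out = layer(x)   # dropout is a no-op in evaluation mode

print(layer.training)              # False
print(torch.equal(eval_out, x))    # True
```

For torchvision detection models, the mode also changes the call itself: in training mode the model expects targets and returns losses, while in evaluation mode it returns predictions.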
Fill both blanks to prepare the input image tensor and move it to the device.
image = image.to([1]).unsqueeze([2])
Common mistakes: using unsqueeze(1) instead of unsqueeze(0); using cpu explicitly when the device is a GPU.
The image tensor is moved to the device (CPU or GPU) with to(device). Then unsqueeze(0) adds a batch dimension at position 0.
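The shape change can be checked directly. The sketch below uses a random tensor as a stand-in for a real image (an assumption made for illustration) and picks the device at run time instead of hard-coding it.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

image = torch.rand(3, 64, 64)            # placeholder image, shape [C, H, W]

batched = image.to(device).unsqueeze(0)  # batch dimension added at position 0
wrong = image.unsqueeze(1)               # new dimension lands after the channels

print(batched.shape)  # torch.Size([1, 3, 64, 64])
print(wrong.shape)    # torch.Size([3, 1, 64, 64])
```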
Fill all three blanks to run the model on input and get masks from the output.
with torch.no_grad():
    output = model([1])
    masks = output[0][[2]].squeeze([3]) > 0.5
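The model expects a list of image tensors, so the input is [image]. The masks are in output[0]['masks'], with shape [N, 1, H, W] for N detected instances. squeeze(1) removes the channel dimension, and thresholding at 0.5 turns the per-pixel probabilities into binary masks.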