
Action recognition basics in Computer Vision - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to load video frames for action recognition.

import cv2

cap = cv2.VideoCapture('video.mp4')
ret, frame = cap.[1]()
if ret:
    print('Frame loaded')
cap.release()
Options:
A. capture
B. read
C. load
D. get
Common Mistakes
Calling 'get' without a property ID raises an error; 'get' queries capture properties and does not return frames.
'load' and 'capture' are not methods of cv2.VideoCapture.
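Action recognition usually needs a sequence of frames rather than a single one. The sketch below loops over the `(ret, frame)` pairs that `read()` returns and stops when `ret` is false. The helper `read_frames` and the stand-in class `FakeCapture` are hypothetical names introduced here so the example runs without OpenCV or a video file; they are not part of any library.

```python
def read_frames(cap, max_frames=None):
    """Collect frames from a capture-like object until read() reports failure."""
    frames = []
    try:
        while max_frames is None or len(frames) < max_frames:
            ret, frame = cap.read()  # ret is False once no frame is available
            if not ret:
                break
            frames.append(frame)
    finally:
        cap.release()  # always free the capture, even if reading fails
    return frames


class FakeCapture:
    """Hypothetical stand-in that mimics read() and release() for illustration."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.released = False

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self):
        self.released = True


cap = FakeCapture(["frame0", "frame1", "frame2"])
frames = read_frames(cap)
print(len(frames))
```

With OpenCV installed, passing `cv2.VideoCapture('video.mp4')` in place of `FakeCapture(...)` should work the same way, since the helper only relies on `read()` and `release()`.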
2. Fill in the blank (medium)

Complete the code to extract features from frames using a pretrained CNN model.

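import cv2
import torch
from PIL import Image
from torchvision import models, transforms

# Load an ImageNet-pretrained ResNet-18 (the weights= API replaces the deprecated pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()  # switch to inference mode

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# 'frame' is assumed to be a BGR NumPy array from cv2; convert it to an RGB PIL image first
image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0)  # add a batch dimension: (1, 3, 224, 224)

with torch.no_grad():
    features = model.[1](input_batch)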
Options:
A. forward
B. predict
C. transform
D. fit
Common Mistakes
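Calling 'predict' raises an AttributeError because PyTorch modules do not define it.
'fit' and 'transform' come from other libraries (such as scikit-learn); PyTorch modules do not define them either.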
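The `Normalize` step in the pipeline above applies `(value - mean) / std` to each channel after `ToTensor` has scaled pixel values to the range [0, 1]. The following is a minimal pure-Python sketch of that arithmetic for a single RGB pixel. The helper name `normalize_pixel` is hypothetical and only illustrates the formula; it is not a torchvision function.

```python
# ImageNet channel statistics (R, G, B), as used in the preprocessing pipeline
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (value - mean) / std per channel to one pixel with values in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]


# A pixel whose color equals the channel means maps to zero in every channel
print(normalize_pixel([0.485, 0.456, 0.406]))
```

Values brighter than the channel mean come out positive and darker values come out negative, which matches the input distribution the pretrained weights were trained on.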
3. Fill in the blank (hard)

Complete the code so that it correctly computes accuracy for action recognition predictions.

correct = 0
for pred, label in zip(predictions, labels):
    if pred == [1]:
        correct += 1
accuracy = correct / len(labels)
print(f'Accuracy: {accuracy:.2f}')
Options:
A. pred
B. predictions
C. labels
D. label
Common Mistakes
Comparing 'pred' to the whole 'predictions' or 'labels' list is never true for scalar predictions, so accuracy comes out as 0.
Comparing 'pred' to itself is always true, which inflates accuracy to 1.0.
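The same computation can be wrapped in a small function. This is a sketch under the assumption that predictions and labels are plain Python sequences of comparable values; the function name `accuracy` and the sample data are made up for illustration.

```python
def accuracy(predictions, labels):
    """Fraction of positions where the prediction equals the ground-truth label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0  # avoid dividing by zero on empty input
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


preds = ["walk", "run", "jump", "run"]
truth = ["walk", "run", "run", "run"]
print(f"Accuracy: {accuracy(preds, truth):.2f}")  # Accuracy: 0.75
```

The length check matters because `zip` silently stops at the shorter sequence, which would otherwise hide a mismatch between predictions and labels.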
4. Fill in the blank (hard)

Fill both blanks to create a dictionary of frame indices and their corresponding action labels.

frame_labels = {i: [1] for i, [2] in enumerate(predicted_actions)}
Options:
A. label
B. action
C. predicted_actions
D. label_idx
Common Mistakes
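Reusing 'predicted_actions' as the loop variable shadows the list being iterated; give each element its own name.
The value in the first blank must be the name bound by the loop in the second blank; a name the loop does not bind (such as 'label_idx' paired with a different loop variable) raises a NameError.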
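When the predictions belong to a clip that does not start at frame 0, `enumerate` accepts a `start` argument so the keys match the original frame numbers. The sketch below assumes one predicted label per consecutive frame; the helper name `index_labels` and the sample labels are hypothetical.

```python
def index_labels(actions, start=0):
    """Map consecutive frame indices, beginning at `start`, to their labels."""
    return {i: name for i, name in enumerate(actions, start=start)}


clip_actions = ["walk", "walk", "run"]
print(index_labels(clip_actions))            # {0: 'walk', 1: 'walk', 2: 'run'}
print(index_labels(clip_actions, start=30))  # {30: 'walk', 31: 'walk', 32: 'run'}
```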
5. Fill in the blank (hard)

Fill all three blanks to filter frames with confidence above threshold and create a result dictionary.

result = {frame_id: [1] for frame_id, (label, conf) in frame_data.items() if conf [2] [3]}
Options:
A. label
B. >
C. 0.8
D. conf
Common Mistakes
Using 'conf' as the value stores confidence scores instead of labels.
Using '<' instead of '>' keeps the low-confidence frames and discards the high-confidence ones.
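A parameterized version makes the threshold explicit and shows how the strict comparison treats a boundary value. This sketch assumes `frame_data` maps each frame ID to a `(label, confidence)` tuple, as in the exercise; the function name `filter_confident` and the sample values are made up for illustration.

```python
def filter_confident(frame_data, threshold=0.8):
    """Keep only frames whose confidence is strictly above the threshold."""
    return {
        frame_id: label
        for frame_id, (label, conf) in frame_data.items()
        if conf > threshold
    }


frame_data = {0: ("walk", 0.92), 1: ("run", 0.55), 2: ("jump", 0.80)}
print(filter_confident(frame_data))       # {0: 'walk'}
print(filter_confident(frame_data, 0.5))  # {0: 'walk', 1: 'run', 2: 'jump'}
```

Frame 2 is dropped at the default threshold because `0.80 > 0.8` is false; using `>=` instead would keep boundary values.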