Action recognition basics in Computer Vision - Interactive Code Practice

Practice - 5 Tasks

Answer the questions below

1. Fill in the blank (easy)

Complete the code to load video frames for action recognition.

import cv2

cap = cv2.VideoCapture('video.mp4')
ret, frame = cap.[1]()
if ret:
    print('Frame loaded')
cap.release()

A. capture

B. read

C. load

D. get

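Action recognition pipelines rarely load every frame of a clip; a common shortcut is to read a handful of evenly spaced frames. The sketch below only computes which frame indices to load. The function name `sample_frame_indices` and the "middle of each segment" rule are illustrative assumptions, not part of the exercise.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick num_samples frame indices spread evenly across a clip (hypothetical helper)."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Clip is shorter than the requested sample count: use every frame
        return list(range(total_frames))
    # Split the clip into equal segments and take the middle frame of each one
    segment = total_frames / num_samples
    return [int(segment * i + segment / 2) for i in range(num_samples)]


print(sample_frame_indices(100, 4))  # [12, 37, 62, 87]
```

With OpenCV, the total frame count can usually be queried with `cap.get(cv2.CAP_PROP_FRAME_COUNT)`, and a capture can be moved to a given index with `cap.set(cv2.CAP_PROP_POS_FRAMES, idx)` before loading the frame, although seeking accuracy depends on the video codec.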

2. Fill in the blank (medium)

Complete the code to extract features from frames using a pretrained CNN model.

from torchvision import models, transforms
import torch

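# `pretrained=True` is deprecated in torchvision >= 0.13 in favour of `weights=`
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)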
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

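# `frame` is assumed to be a PIL image in RGB order; an OpenCV BGR array would need converting first
input_tensor = preprocess(frame)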
input_batch = input_tensor.unsqueeze(0)

with torch.no_grad():
    features = model.[1](input_batch)

A. forward

B. predict

C. transform

D. fit

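A CNN like the one above produces one feature vector per frame, so a clip still has to be summarized somehow. One simple option is to average the per-frame vectors over time. The sketch below assumes the features have already been converted to plain Python lists of floats; `average_pool` is a hypothetical helper name, and averaging is only one of several possible pooling choices.

```python
from typing import List, Sequence


def average_pool(frame_features: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors into one clip-level vector (illustrative)."""
    if not frame_features:
        raise ValueError("need at least one frame feature vector")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all feature vectors must have the same length")
    n = len(frame_features)
    # Mean over the time axis, one output value per feature dimension
    return [sum(f[d] for f in frame_features) / n for d in range(dim)]


clip_vector = average_pool([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(clip_vector)  # [3.0, 4.0]
```

Averaging discards the order of the frames, so it captures what appears in the clip rather than how it moves.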

3. Fill in the blank (hard)

Fix the error in the code to correctly compute accuracy for action recognition predictions.

correct = 0
for pred, label in zip(predictions, labels):
    if pred == [1]:
        correct += 1
accuracy = correct / len(labels)
print(f'Accuracy: {accuracy:.2f}')

A. pred

B. predictions

C. labels

D. label

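The loop above relies on `zip`, which silently stops at the shorter sequence, and it divides by `len(labels)` without checking for an empty list. A slightly more defensive version might look like the sketch below; the function name `accuracy` and the sample data are made up for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        # Nothing to score; return 0.0 instead of dividing by zero
        return 0.0
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


preds = ["run", "jump", "walk", "run"]   # hypothetical model outputs
truth = ["run", "walk", "walk", "run"]   # hypothetical ground truth
print(f"Accuracy: {accuracy(preds, truth):.2f}")  # Accuracy: 0.75
```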

4. Fill in the blank (hard)

Fill both blanks to create a dictionary of frame indices and their corresponding action labels.

frame_labels = {i: [1] for i, [2] in enumerate(predicted_actions)}

A. label

B. action

C. predicted_actions

D. label_idx

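Per-frame labels such as those in `predicted_actions` are often easier to read once consecutive frames with the same label are merged into segments. The sketch below is one possible way to do that with the standard library; `label_segments` and the sample labels are hypothetical and not part of the exercise.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def label_segments(predicted_actions: Sequence[str]) -> List[Tuple[int, int, str]]:
    """Collapse per-frame labels into (start_frame, end_frame, action) runs."""
    segments = []
    index = 0
    for action, run in groupby(predicted_actions):
        length = len(list(run))
        # end_frame is inclusive
        segments.append((index, index + length - 1, action))
        index += length
    return segments


print(label_segments(["walk", "walk", "run", "run", "run", "walk"]))
# [(0, 1, 'walk'), (2, 4, 'run'), (5, 5, 'walk')]
```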

5. Fill in the blank (hard)

Fill all three blanks to keep only the frames whose confidence is above the threshold and build a result dictionary.

result = {frame_id: [1] for frame_id, (label, conf) in frame_data.items() if conf [2] [3]}

A. label

B. >

C. 0.8

D. conf


Practice

(1/5)

1. What is the main goal of action recognition in computer vision?

easy

A. To generate captions for images

B. To detect objects in images

C. To enhance image resolution

D. To identify human movements in videos

Solution

Step 1: Understand the purpose of action recognition

Step 2: Compare with other tasks

Final Answer:

Quick Check:

Solution

Step 1: Identify video data format

Step 2: Eliminate incorrect options

Final Answer:

Quick Check:

Solution

Step 1: Understand the loop over frames

Step 2: Count how many features are appended

Final Answer:

Quick Check:

Solution

Step 1: Analyze feature extraction and model input

Step 2: Check other training steps

Final Answer:

Quick Check:

Solution

Step 1: Understand spatial vs temporal features

Step 2: Identify model type capturing motion

Step 3: Evaluate other options

Final Answer:

Quick Check: