Practice

1. What is the main purpose of hand and face landmark detection in computer vision?

easy

A. To compress video files

B. To increase image resolution

C. To change the color of images

D. To find key points on hands and faces in images or videos

Solution

Step 1: Understand the goal of landmark detection
Landmark detection identifies important points on hands and faces to understand their shape and position.
Step 2: Compare options with the goal
Only option D, "To find key points on hands and faces in images or videos", matches this goal. The other options describe compression, upscaling, and color editing.
Final Answer:
To find key points on hands and faces in images or videos -> Option D
Quick Check:
Landmark detection = key points detection [OK]

Hint: Landmark detection means finding important points [OK]

Common Mistakes:

Confusing landmark detection with image enhancement
Thinking it changes image colors
Mixing it up with video compression
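
To make "key points" concrete: MediaPipe's hand and face solutions report each landmark with x and y normalized to the range 0 to 1 relative to the image width and height. The sketch below maps such points to pixel positions. The helper name to_pixel_coords and the sample values are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def to_pixel_coords(
    landmarks: List[Tuple[float, float]], width: int, height: int
) -> List[Tuple[int, int]]:
    """Map normalized (x, y) landmarks in [0, 1] to integer pixel positions."""
    pixels = []
    for x, y in landmarks:
        # Clamp so points on or slightly beyond the border stay inside the image.
        px = min(max(int(x * width), 0), width - 1)
        py = min(max(int(y * height), 0), height - 1)
        pixels.append((px, py))
    return pixels


# Hypothetical normalized landmarks, not output from a real detection.
example = [(0.5, 0.75), (0.25, 0.5), (0.5, 0.25)]
print(to_pixel_coords(example, width=640, height=480))
```

Pixel positions like these are what a drawing or measurement step would typically consume.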

2. Which of the following is the correct way to import MediaPipe's hand landmark detection module in Python?

easy

A. import mediapipe.solutions.hands as mp_hands

B. import mediapipe.hands as mp_hands

C. import mediapipe as mp mp.solutions.hands

D. from mediapipe import hands

Solution

Step 1: Recall MediaPipe import syntax
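MediaPipe's ready-made solutions live under mediapipe.solutions, e.g., mediapipe.solutions.hands. The same module is also commonly reached as mp.solutions.hands after import mediapipe as mp, as in question 3.
Step 2: Check each option
Option A uses the full mediapipe.solutions.hands path and binds it to the alias mp_hands. Options B and D skip the solutions package, and option C never assigns the module to a name.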
Final Answer:
import mediapipe.solutions.hands as mp_hands -> Option A
Quick Check:
Correct import = mediapipe.solutions.hands [OK]

Hint: MediaPipe modules come from mediapipe.solutions [OK]

Common Mistakes:

Using incorrect import paths
Trying to import submodules directly without solutions
Confusing alias names

3. Given the following Python code using MediaPipe for hand landmarks detection, what will be printed?

import mediapipe as mp
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=True)
results = hands.process(image_rgb)
print(len(results.multi_hand_landmarks))

Assuming image_rgb contains one clear hand.

medium

A. 1

B. Error

C. None

D. 0

Solution

Step 1: Understand the code flow
The code processes an RGB image with one hand using MediaPipe Hands in static mode.
Step 2: Interpret the output
Since one hand is present, results.multi_hand_landmarks will contain one set of landmarks, so its length is 1.
Final Answer:
1 -> Option A
Quick Check:
One hand detected = length 1 [OK]

Hint: Length of landmarks list equals number of detected hands [OK]

Common Mistakes:

Assuming zero when hand is present
Confusing None with an empty list (multi_hand_landmarks is None, not [], when no hand is detected)
Expecting error without checking input
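
The snippet in this question only works when a hand is found. When nothing is detected, multi_hand_landmarks is None, so len() raises TypeError. A minimal guard is sketched below. The helper name count_hands is hypothetical, and SimpleNamespace objects stand in for the real result object so the sketch runs without MediaPipe installed.

```python
from types import SimpleNamespace


def count_hands(results) -> int:
    """Return the number of detected hands, treating None as zero."""
    # multi_hand_landmarks is None (not an empty list) when nothing is found,
    # so calling len() on it directly would raise TypeError.
    return len(results.multi_hand_landmarks or [])


# Stand-ins for the object returned by hands.process(); real results carry
# landmark objects rather than strings.
one_hand = SimpleNamespace(multi_hand_landmarks=["hand_0"])
no_hand = SimpleNamespace(multi_hand_landmarks=None)

print(count_hands(one_hand))
print(count_hands(no_hand))
```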

4. You wrote this code to detect face landmarks but get an error:

import mediapipe as mp
mp_face = mp.solutions.face_mesh
face_mesh = mp_face.FaceMesh()
results = face_mesh.process(image_bgr)
print(results.multi_face_landmarks)

What is the likely cause of the error?

medium

A. Missing import for cv2

B. FaceMesh class does not exist

C. Input image should be RGB, not BGR

D. process() method requires grayscale image

Solution

Step 1: Check input image format for MediaPipe FaceMesh
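MediaPipe expects RGB images, but the code passes image_bgr, which is in BGR channel order (the order OpenCV uses when loading images).
Step 2: Understand error cause
Passing a BGR image usually does not raise an exception. Instead, the red and blue channels are swapped, so detection becomes unreliable and faces may be missed.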
Final Answer:
Input image should be RGB, not BGR -> Option C
Quick Check:
MediaPipe needs RGB input images [OK]

Hint: Always convert BGR to RGB before MediaPipe processing [OK]

Common Mistakes:

Passing BGR images directly
Assuming FaceMesh class is missing
Thinking grayscale is required
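
The channel swap can be sketched with NumPy alone. The helper name bgr_to_rgb and the two-pixel dummy image are assumptions for illustration; a real frame would typically come from OpenCV.

```python
import numpy as np


def bgr_to_rgb(image_bgr: np.ndarray) -> np.ndarray:
    """Return an RGB copy of an H x W x 3 BGR image."""
    # Reversing the last axis swaps the blue and red channels. The copy is
    # made contiguous because some libraries reject negatively strided views.
    return np.ascontiguousarray(image_bgr[..., ::-1])


# A 1 x 2 dummy image: one pure-blue and one pure-red pixel in BGR order.
image_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
image_rgb = bgr_to_rgb(image_bgr)
print(image_rgb.tolist())
```

With OpenCV installed, cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB) performs the same conversion. The converted array is what would be passed to face_mesh.process().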

5. You want to build a gesture recognition app using hand landmarks. Which approach best improves accuracy when hands are rotated or partially hidden?

hard

A. Only train on perfectly centered and clear hand images

B. Use data augmentation with rotated and occluded hand images during training

C. Ignore landmarks and use raw images directly

D. Use grayscale images instead of color

Solution

Step 1: Understand challenges in gesture recognition
Hands can appear rotated or partly hidden, so the model must handle these variations.
Step 2: Choose best method to improve robustness
Data augmentation with rotated and occluded images exposes the model to these variations during training, so it learns to recognize gestures despite them.
Final Answer:
Use data augmentation with rotated and occluded hand images during training -> Option B
Quick Check:
Augmentation improves model robustness [OK]

Hint: Augment training data to handle rotations and occlusions [OK]

Common Mistakes:

Training only on perfect images
Ignoring landmarks and relying on raw images alone
Switching to grayscale, which discards color information without addressing rotation or occlusion
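
The question describes augmenting images. As a lighter-weight illustration of the same idea, the sketch below applies rotation and simulated occlusion to 2D landmark coordinates instead of pixels. This is a swapped-in variant, not the image-level augmentation the answer refers to, and the function names, the drop probability, and the sample points are all hypothetical.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def rotate_landmarks(points: List[Point], degrees: float) -> List[Point]:
    """Rotate 2D landmarks about their centroid by the given angle."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    cos_a = math.cos(math.radians(degrees))
    sin_a = math.sin(math.radians(degrees))
    return [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in points
    ]


def occlude_landmarks(
    points: List[Point], drop_prob: float, rng: random.Random
) -> List[Optional[Point]]:
    """Simulate occlusion: replace each landmark with None with probability drop_prob."""
    return [None if rng.random() < drop_prob else p for p in points]


rng = random.Random(0)  # fixed seed so the augmentation is repeatable
hand = [(0.5, 0.9), (0.4, 0.6), (0.5, 0.5), (0.6, 0.6)]  # hypothetical points
augmented = occlude_landmarks(rotate_landmarks(hand, 30.0), 0.25, rng)
print(augmented)
```

A classifier trained on such samples would also need a convention for missing points, for example a mask channel or a sentinel value.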

Epoch	Loss ↓	Accuracy ↑	Observation
1	0.12	0.65	Model starts learning basic landmark positions
2	0.08	0.75	Loss decreases as model improves landmark precision
3	0.05	0.82	Model captures hand and face shapes better
4	0.035	0.88	Fine details like finger joints detected more accurately
5	0.025	0.91	Training converges with stable low loss and high accuracy

Hand and face landmark detection in Computer Vision - Model Pipeline Trace
