
Text Detection in Images (Computer Vision) - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding Text Detection Models

Which of the following best describes the main goal of a text detection model in images?

A. Classify the type of font used in the text regions.
B. Identify and locate regions in the image that contain text.
C. Translate the detected text into another language.
D. Enhance the image quality to make text clearer.
💡 Hint

Think about what 'detection' means in the context of images.

Predict Output
intermediate
Output of Text Region Coordinates Extraction

What does the following Python snippet print after running the EAST text detector on an image through OpenCV's dnn module?

import cv2
import numpy as np

# Assume 'image' is a 3-channel BGR image already loaded, e.g. with cv2.imread()
net = cv2.dnn.readNet('frozen_east_text_detection.pb')
blob = cv2.dnn.blobFromImage(image, 1.0, (320, 320), (123.68, 116.78, 103.94), True, False)
net.setInput(blob)
scores, geometry = net.forward(['feature_fusion/Conv_7/Sigmoid', 'feature_fusion/concat_3'])

# Decode one candidate box for every score-map cell that passes the threshold
conf_threshold = 0.5
boxes = []
for y in range(scores.shape[2]):
    for x in range(scores.shape[3]):
        score = scores[0, 0, y, x]
        if score < conf_threshold:
            continue
        # The output maps are 4x smaller than the input, so scale back to input pixels
        offsetX, offsetY = x * 4.0, y * 4.0
        angle = geometry[0, 4, y, x]
        cos = np.cos(angle)
        sin = np.sin(angle)
        h = geometry[0, 0, y, x] + geometry[0, 2, y, x]
        w = geometry[0, 1, y, x] + geometry[0, 3, y, x]
        endX = int(offsetX + (cos * geometry[0, 1, y, x]) + (sin * geometry[0, 2, y, x]))
        endY = int(offsetY - (sin * geometry[0, 1, y, x]) + (cos * geometry[0, 2, y, x]))
        startX = int(endX - w)
        startY = int(endY - h)
        boxes.append((startX, startY, endX, endY))

print(len(boxes))
A. A float value representing the average confidence score of all detections.
B. A list of strings containing the detected text content.
C. A 2D array of pixel intensities of the input image.
D. An integer: the number of candidate text boxes whose confidence is at least 0.5.
💡 Hint

Look at what is appended to boxes and what is printed.
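
To make the hint concrete, here is a minimal sketch of the counting logic alone. It uses a made-up random score map in place of the real network output (no model file is loaded), so the actual count is meaningless; the point is only that the printed value is an integer count of cells that pass the threshold.

```python
import numpy as np

# Synthetic stand-in for the EAST score map, shape (1, 1, H, W).
# Real values would come from net.forward(); these are invented for illustration.
rng = np.random.default_rng(0)
scores = rng.random((1, 1, 80, 80)).astype(np.float32)

conf_threshold = 0.5

# Loop version, mirroring the snippet: skip cells below the threshold.
loop_count = 0
for y in range(scores.shape[2]):
    for x in range(scores.shape[3]):
        if scores[0, 0, y, x] < conf_threshold:
            continue
        loop_count += 1

# Vectorized equivalent: count cells at or above the threshold.
vector_count = int(np.count_nonzero(scores[0, 0] >= conf_threshold))

print(type(loop_count).__name__, loop_count == vector_count)  # prints "int True"
```

Note that this counts raw candidates. Neighbouring cells over the same word each produce a box, so the number is usually larger than the number of words in the image.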

Model Choice
advanced
Choosing a Model Architecture for Text Detection

You want to detect text in natural scene images with varying fonts and orientations. Which model architecture is most suitable?

A. A simple feedforward neural network that classifies image patches as text or non-text.
B. A Generative Adversarial Network (GAN) trained to generate synthetic text images.
C. A Convolutional Neural Network (CNN) based EAST detector that outputs rotated bounding boxes.
D. A Recurrent Neural Network (RNN) designed for sequence prediction on text strings.
💡 Hint

Consider which model can handle spatial features and rotations for detection.
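
As an illustration of what a rotated bounding box is, the sketch below computes the four corners of a box from its center, size, and angle. This is generic geometry, not the EAST decoding step, and the function name `rotated_box_corners` and the sample numbers are hypothetical.

```python
import numpy as np

def rotated_box_corners(cx, cy, w, h, angle_rad):
    """Return a 4x2 array with the corners of a w-by-h box centered at
    (cx, cy) and rotated by angle_rad about its center."""
    # Corners of the axis-aligned box, relative to its center.
    half = np.array([[-w / 2, -h / 2],
                     [ w / 2, -h / 2],
                     [ w / 2,  h / 2],
                     [-w / 2,  h / 2]])
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rotation = np.array([[c, -s],
                         [s,  c]])
    # Rotate each corner, then shift to the box center.
    return half @ rotation.T + np.array([cx, cy])

# Hypothetical text line: 80 px wide, 20 px tall, tilted by 30 degrees.
corners = rotated_box_corners(100.0, 50.0, 80.0, 20.0, np.deg2rad(30))
print(np.round(corners, 1))
```

An axis-aligned box around the same tilted line would have to be much taller, pulling in background pixels, which is why rotated outputs suit text at arbitrary orientations.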

Metrics
advanced
Evaluating Text Detection Performance

Which metric is most appropriate to evaluate the quality of text detection bounding boxes compared to ground truth boxes?

A. Intersection over Union (IoU) between predicted and ground truth boxes.
B. Mean Squared Error (MSE) between pixel intensities inside detected boxes.
C. Accuracy of character recognition inside detected text regions.
D. Perplexity score of the recognized text sequences.
💡 Hint

Think about how to measure overlap between predicted and actual boxes.
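
A minimal sketch of that overlap measure for axis-aligned boxes, assuming the same (startX, startY, endX, endY) layout used in the code earlier on this page. The helper name `iou` and the two sample boxes are made up for illustration; rotated boxes would need a polygon intersection instead.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes
    given as (startX, startY, endX, endY)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0

predicted = (10, 10, 60, 40)      # hypothetical detector output
ground_truth = (20, 15, 70, 45)   # hypothetical annotation
print(round(iou(predicted, ground_truth), 3))  # prints 0.5
```

Here the boxes share a 40 x 25 region (area 1000) and each covers 1500 pixels, so the union is 2000 and the IoU is 0.5. A detection is commonly counted as correct when its IoU with a ground truth box exceeds a chosen cutoff.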

🔧 Debug
expert
Debugging Text Detection Output

You run a text detection model on an image but get zero detected boxes, even though the image clearly contains text. Which of the following is the most likely cause?

A. The confidence threshold is set too high, filtering out all detections.
B. The model weights file is missing, so the model cannot run.
C. The input image is in grayscale instead of color, causing model failure.
D. The detected boxes are returned but not printed due to a missing print statement.
💡 Hint

Consider what happens if the threshold for detection confidence is too strict.
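
One way to check this in practice is to sweep the threshold and watch the number of surviving candidates. The sketch below does so on a made-up score map whose values all stay below 0.62, standing in for a model that is only moderately confident; the array and the threshold values are assumptions for illustration.

```python
import numpy as np

# Invented score map: every value lies in [0, 0.62), so no cell is highly confident.
rng = np.random.default_rng(1)
scores = rng.random((80, 80)).astype(np.float32) * 0.62

print("max score:", float(scores.max()))

# Count how many cells survive at each threshold.
kept_by_threshold = {}
for threshold in (0.3, 0.5, 0.7, 0.9):
    kept = int(np.count_nonzero(scores >= threshold))
    kept_by_threshold[threshold] = kept
    print(f"threshold={threshold}: {kept} candidate boxes")
```

With these values, thresholds of 0.7 and 0.9 keep nothing at all. Printing the maximum score first, as above, is a quick diagnostic: if it is below the threshold, an empty box list is expected rather than a sign that the model failed to run.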