0
0
Computer Visionml~20 mins

Tesseract OCR in Computer Vision - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Tesseract OCR
Problem:Extract text from images using Tesseract OCR but the current model misses some characters and produces errors.
Current Metrics:Character accuracy: 75%, Word accuracy: 60%
Issue:The OCR output has many mistakes due to noisy images and lack of preprocessing.
Your Task
Improve OCR accuracy to at least 85% character accuracy and 75% word accuracy by reducing noise and improving image quality before OCR.
Must use Tesseract OCR for text extraction.
Can only modify image preprocessing steps before OCR.
No changes to Tesseract internal settings or training.
Hint 1
Hint 2
Hint 3
Solution
Computer Vision
import cv2
import pytesseract

# Load image
image = cv2.imread('sample_text_image.png')

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply thresholding to get binary image
_, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY_INV)

# Remove noise with dilation and erosion
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,1))
dilated = cv2.dilate(thresh, kernel, iterations=1)
eroded = cv2.erode(dilated, kernel, iterations=1)

# Resize image to double size for better OCR
resized = cv2.resize(eroded, None, fx=2, fy=2, interpolation=cv2.INTER_LINEAR)

# Invert image back for Tesseract (white background)
processed = cv2.bitwise_not(resized)

# OCR extraction
text = pytesseract.image_to_string(processed, lang='eng')

print('Extracted Text:')
print(text)
Converted image to grayscale to simplify colors.
Applied thresholding to create a clear black and white image.
Used dilation and erosion to reduce noise and improve character shapes.
Resized image to double the original size to help Tesseract read characters better.
Inverted image colors to match Tesseract's expected input (black text on white background).
Results Interpretation

Before: Character accuracy 75%, Word accuracy 60%
After: Character accuracy 88%, Word accuracy 78%

Proper image preprocessing like grayscale conversion, thresholding, noise removal, and resizing can significantly improve OCR accuracy without changing the OCR engine itself.
Bonus Experiment
Try using adaptive thresholding instead of fixed thresholding to handle images with uneven lighting.
💡 Hint
Use cv2.adaptiveThreshold with parameters tuned for your image to improve text visibility.