Tesseract OCR in Computer Vision - Model Metrics & Evaluation

Metrics & Evaluation - Tesseract OCR
Which metric matters for Tesseract OCR and WHY

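Tesseract OCR converts images of text into machine-readable text, and the goal is to reproduce that text exactly. The key metrics are therefore Character Error Rate (CER) and Word Error Rate (WER). Each one is the edit distance between the OCR output and the true text (the minimum number of substitutions, insertions, and deletions needed to turn one into the other) divided by the number of characters or words in the true text.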

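Lower CER and WER mean better OCR quality. Accuracy is also reported, but CER and WER give a clearer picture because they count every wrong, extra, and missing character or word. Both can exceed 100% when the output contains many spurious insertions.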
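
As a minimal sketch of how these two rates can be computed, the following pure-Python example uses a standard Levenshtein edit distance. The function names (`edit_distance`, `cer`, `wer`) and the sample strings are illustrative assumptions, not part of Tesseract, and words are split on whitespace only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    # prev[j] is the distance between the ref prefix processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion (ref item missing in hyp)
                            curr[j - 1] + 1,      # insertion (extra item in hyp)
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference text must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical ground truth and OCR output, for illustration only.
truth = "the quick brown fox"
ocr = "the qu1ck brown f0x"
print(f"CER = {cer(truth, ocr):.3f}")  # 2 wrong characters out of 19
print(f"WER = {wer(truth, ocr):.3f}")  # 2 wrong words out of 4
```

Two wrong characters are enough to make half of the words wrong here, which is why WER is usually noticeably higher than CER on the same output.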

Confusion matrix or equivalent visualization

For OCR, a confusion matrix can show how often one character is mistaken for another. For example:

    Actual \ Predicted |  a  |  o  |  e  |  l  |  i
    -------------------+-----+-----+-----+-----+-----
                     a |  90 |   5 |   2 |   1 |   2
                     o |   3 |  85 |   5 |   4 |   3
                     e |   2 |   4 |  88 |   3 |   3
                     l |   1 |   2 |   3 |  90 |   4
                     i |   2 |   3 |   3 |   5 |  87

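This shows how often Tesseract confuses letters. Rows are the actual characters and columns are the predicted characters, and each row here sums to 100. The diagonal numbers are correct predictions (true positives for each character). Every off-diagonal number counts one specific confusion, such as an actual "a" read as "o" 5 times.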
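
To make the table easier to act on, here is a small sketch that derives per-character recall (diagonal divided by row sum) and precision (diagonal divided by column sum) from the example matrix and lists the most frequent confusions. The helper names `per_character_scores` and `top_confusions` are assumptions made for this illustration.

```python
labels = ["a", "o", "e", "l", "i"]

# Rows are the actual character, columns the predicted character
# (values copied from the example table above).
matrix = [
    [90, 5, 2, 1, 2],
    [3, 85, 5, 4, 3],
    [2, 4, 88, 3, 3],
    [1, 2, 3, 90, 4],
    [2, 3, 3, 5, 87],
]


def per_character_scores(labels, matrix):
    """Return {char: {"precision": ..., "recall": ...}} from a confusion matrix."""
    scores = {}
    for k, label in enumerate(labels):
        correct = matrix[k][k]
        actual_total = sum(matrix[k])                    # row sum
        predicted_total = sum(row[k] for row in matrix)  # column sum
        scores[label] = {
            "recall": correct / actual_total,
            "precision": correct / predicted_total,
        }
    return scores


def top_confusions(labels, matrix, n=3):
    """Return the n largest off-diagonal cells as (count, actual, predicted)."""
    pairs = [
        (matrix[i][j], labels[i], labels[j])
        for i in range(len(labels))
        for j in range(len(labels))
        if i != j
    ]
    pairs.sort(reverse=True)
    return pairs[:n]


for char, s in per_character_scores(labels, matrix).items():
    print(f"{char}: precision={s['precision']:.3f} recall={s['recall']:.3f}")

for count, actual, predicted in top_confusions(labels, matrix):
    print(f"actual '{actual}' read as '{predicted}': {count} times")
```

Sorting the off-diagonal cells points directly at the character pairs that would benefit most from better preprocessing or training data.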

Precision vs Recall tradeoff with examples

In OCR, precision is the fraction of the characters in the OCR output that are actually correct. Recall is the fraction of the true characters that the OCR recovered.

If precision is high but recall is low, the OCR is very sure about the characters it outputs but misses many characters (like skipping hard-to-read words).

If recall is high but precision is low, the OCR tries to read everything but makes many mistakes.

Example: For reading handwritten notes, high recall is important to capture all words, even if some are wrong. For legal documents, high precision is critical to avoid errors.
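
The tradeoff can be sketched at the character level with Python's standard `difflib.SequenceMatcher`, which aligns the true text with the OCR output and counts matching characters. This is an approximation: `SequenceMatcher` does not guarantee an optimal alignment, and the function name `char_precision_recall` and the sample strings are assumptions for illustration.

```python
from difflib import SequenceMatcher


def char_precision_recall(reference, hypothesis):
    """Approximate character-level precision and recall of an OCR output."""
    # autojunk=False disables the heuristic that ignores very frequent characters.
    matcher = SequenceMatcher(None, reference, hypothesis, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    precision = matched / len(hypothesis) if hypothesis else 0.0
    recall = matched / len(reference) if reference else 0.0
    return precision, recall


truth = "pay 100 dollars"

# A cautious output that skips the hard-to-read number:
# everything it emits is right, but part of the text is missing.
cautious = "pay dollars"

# An eager output that reads everything, including noise as punctuation:
# nothing is missing, but some of what it emits is wrong.
eager = "pay 100 dollars.,;'"

for name, output in [("cautious", cautious), ("eager", eager)]:
    p, r = char_precision_recall(truth, output)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

The cautious output scores perfect precision with reduced recall, and the eager output scores perfect recall with reduced precision, mirroring the two cases described above.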

What "good" vs "bad" metric values look like for Tesseract OCR

Good OCR:

  • Character Error Rate (CER) below 5%
  • Word Error Rate (WER) below 10%
  • Precision and recall both above 90%

Bad OCR:

  • CER above 20%
  • WER above 30%
  • Precision or recall below 70%
  • Many confused characters or missing words
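
The thresholds above can be turned into a simple check, sketched below. The function name `rate_ocr` and the "borderline" label for results that fall between the two lists are assumptions added for illustration; the numeric limits are the ones listed above.

```python
def rate_ocr(cer, wer, precision, recall):
    """Rate OCR quality from metrics given as fractions (0.05 means 5%)."""
    # Any single "bad" criterion is enough to flag the result.
    if cer > 0.20 or wer > 0.30 or precision < 0.70 or recall < 0.70:
        return "bad"
    # "Good" requires every criterion to hold at once.
    if cer < 0.05 and wer < 0.10 and precision > 0.90 and recall > 0.90:
        return "good"
    # Assumed label for everything in between the two lists.
    return "borderline"


print(rate_ocr(cer=0.03, wer=0.08, precision=0.95, recall=0.94))
print(rate_ocr(cer=0.25, wer=0.40, precision=0.80, recall=0.75))
print(rate_ocr(cer=0.10, wer=0.15, precision=0.85, recall=0.85))
```

Treat these limits as rough guides: the acceptable error rate depends on the documents and on how the recognized text will be used.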

Common pitfalls in OCR metrics

  • Accuracy paradox: High overall accuracy can be misleading when most of the text is easy and the errors, though rare, are critical (for example, a single wrong digit in an amount).
  • Ignoring context: CER and WER compare strings only, so they do not capture whether the recognized text makes sense.
  • Data leakage: Testing on images that are very similar to the training images inflates scores.
  • Overfitting: OCR tuned heavily for one font or style may fail on others.
  • Ignoring layout: OCR may get the characters right but fail to preserve the reading order.

Self-check question

Your OCR model has 98% accuracy but a 12% recall on rare handwritten words. Is it good for production? Why or why not?

Answer: No, it is not good. Even though overall accuracy is high, the low recall on handwritten words means many words are missed. This can cause important information loss, especially if handwritten text is critical.
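
A short worked calculation shows how both numbers can be true at once. The word counts below are hypothetical, and the sketch assumes "accuracy" means the fraction of all words recognized correctly and "recall" means the fraction of handwritten words recognized correctly.

```python
def required_printed_accuracy(total, rare, rare_recall, overall_accuracy):
    """Accuracy needed on the non-rare words to reach the overall accuracy."""
    correct_overall = overall_accuracy * total
    correct_rare = rare_recall * rare
    return (correct_overall - correct_rare) / (total - rare)


total_words = 10_000       # hypothetical corpus size
handwritten_words = 200    # hypothetical: 2% of the words are rare handwritten ones
handwritten_recall = 0.12  # from the question
overall_accuracy = 0.98    # from the question

printed_accuracy = required_printed_accuracy(
    total_words, handwritten_words, handwritten_recall, overall_accuracy
)
missed_handwritten = round(handwritten_words * (1 - handwritten_recall))

print(f"Accuracy on printed words: {printed_accuracy:.4f}")
print(f"Handwritten words missed: {missed_handwritten} of {handwritten_words}")
```

Under these assumptions the printed words are read almost perfectly while 176 of the 200 handwritten words are lost, so the overall accuracy hides the failure on the subset that may matter most.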

Key Result
Character Error Rate (CER) and Word Error Rate (WER) are the key metrics for Tesseract OCR quality because they count every wrong, missing, and extra character or word instead of reporting a single overall accuracy figure.