
Table extraction from images in Computer Vision - Model Metrics & Evaluation

Which metric matters for Table extraction from images and WHY

For table extraction, we want to measure how well the model finds the correct table cells and their content. Key metrics include Precision and Recall. Precision tells us how many detected table cells are actually correct, avoiding false detections. Recall tells us how many real table cells the model found, avoiding missed cells. The F1 score balances these two. High precision means clean, accurate tables; high recall means complete tables. We also use Intersection over Union (IoU) to check how well the predicted cell boxes overlap with the true boxes.
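The IoU check mentioned above can be sketched in a few lines. This is a minimal illustration for axis-aligned boxes given as `(x1, y1, x2, y2)`; the box coordinates in the example are hypothetical.

```python
# Intersection over Union (IoU) for two axis-aligned cell boxes,
# each given as (x1, y1, x2, y2). Example coordinates are made up.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width/height of the overlap rectangle (zero if boxes do not intersect)
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - inter)
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # overlap 50, union 150 -> ~0.333
```

A predicted cell box is usually only counted as correct when its IoU with a ground-truth box clears a threshold such as 0.5 or 0.75.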

Confusion matrix for Table extraction
True Positives (TP): Correctly detected table cells
False Positives (FP): Detected cells that are not real
False Negatives (FN): Real cells missed by the model

Example confusion matrix counts:
+------------------+---------------------+
|                  |   Predicted Cell    |
|                  |   Yes        No     |
+------------------+---------------------+
| Actual Cell Yes  | TP = 80    FN = 20  |
| Actual Cell No   | FP = 10    TN = 90  |
+------------------+---------------------+

Total cells = TP + FP + FN + TN = 80 + 10 + 20 + 90 = 200

Precision = TP / (TP + FP) = 80 / (80 + 10) = 0.89
Recall = TP / (TP + FN) = 80 / (80 + 20) = 0.80
F1 Score = 2 * (0.89 * 0.80) / (0.89 + 0.80) ≈ 0.84
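The same arithmetic, using the counts from the confusion matrix above:

```python
# Counts from the confusion matrix above
tp, fp, fn, tn = 80, 10, 20, 90

precision = tp / (tp + fp)                          # 80 / 90  ~ 0.89
recall = tp / (tp + fn)                             # 80 / 100 = 0.80
f1 = 2 * precision * recall / (precision + recall)  # ~ 0.84

print(f"P={precision:.2f}  R={recall:.2f}  F1={f1:.2f}")
```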
Precision vs Recall tradeoff with examples

If the model has high precision but low recall, it means it finds mostly correct table cells but misses many real ones. This leads to incomplete tables, which can be bad if you need full data.

If the model has high recall but low precision, it finds most real cells but also includes many wrong ones. This creates noisy tables with errors.

For example, in financial reports, missing a table cell (low recall) can lose important data. But including wrong cells (low precision) can cause wrong calculations. So a balance (high F1) is best.
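The tradeoff usually shows up when you sweep the detector's confidence threshold: a strict threshold keeps only confident (mostly correct) cells, while a loose one keeps more real cells plus more noise. The scores and labels below are made-up values for illustration.

```python
# Precision/recall at a given confidence threshold.
# scores: detection confidences; labels: True if the detection is a real cell.
# All values here are hypothetical.
def pr_at_threshold(scores, labels, threshold):
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [True, True, False, True, True, False, True]
for t in (0.85, 0.5):
    p, r = pr_at_threshold(scores, labels, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
# High threshold: precision 1.00, recall 0.40 (clean but incomplete)
# Low threshold:  precision 0.80, recall 0.80 (more complete, more noise)
```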

What good vs bad metric values look like for Table extraction
  • Good: Precision and Recall above 0.85, F1 score above 0.85, IoU above 0.75 for cell bounding boxes. This means most cells are correctly found and well localized.
  • Bad: Precision or Recall below 0.5 means many errors or misses. F1 below 0.6 means poor balance. IoU below 0.5 means boxes do not match well, causing wrong cell boundaries.
Common pitfalls in Table extraction metrics
  • Accuracy paradox: If most images have no tables, a model that always predicts no table can have high accuracy but is useless.
  • Data leakage: Using the same documents for training and testing inflates metrics falsely.
  • Overfitting: Very high training metrics but low test metrics means the model memorizes tables instead of generalizing.
  • Ignoring IoU: Counting a detected cell as correct without checking overlap can overestimate performance.
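The last pitfall above is worth making concrete: TP/FP/FN counts should come from IoU-based matching, where a prediction only counts as a true positive if it overlaps an unmatched ground-truth box above a threshold. This is a simplified greedy sketch (production evaluators often use stricter matching); box values are hypothetical.

```python
# Greedy IoU-based matching of predicted vs ground-truth cell boxes.
# Boxes are (x1, y1, x2, y2); a prediction is a TP only if its best
# unmatched ground-truth box has IoU >= thr. Example boxes are made up.
def iou(a, b):
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def match(preds, gts, thr=0.5):
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best = max((i for i in range(len(gts)) if i not in matched),
                   key=lambda i: iou(p, gts[i]), default=None)
        if best is not None and iou(p, gts[best]) >= thr:
            matched.add(best)
            tp += 1
    return tp, len(preds) - tp, len(gts) - tp  # TP, FP, FN

gts = [(0, 0, 10, 10), (20, 0, 30, 10)]
preds = [(1, 0, 11, 10), (50, 50, 60, 60)]
print(match(preds, gts))  # (1, 1, 1): one good match, one FP, one missed cell
```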
Self-check question

Your table extraction model has 98% accuracy but only 12% recall on table cells. Is it good for production? Why or why not?

Answer: No, it is not good. The high accuracy is misleading: most images or regions contain no tables, so predicting "no table" almost everywhere is easy and still scores well on accuracy. The 12% recall means the model misses almost all real table cells, so it fails to extract usable tables. For production, recall must be much higher so that tables are captured completely.
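Hypothetical counts that reproduce roughly this situation make the paradox easy to see:

```python
# Made-up counts: 100 real cells among 10,000 candidate regions.
# Accuracy looks excellent while recall is terrible.
tp, fn = 12, 88       # model finds only 12 of 100 real cells
tn, fp = 9788, 112    # the vast majority of regions are not cells

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)
print(f"accuracy={accuracy:.2%}, recall={recall:.0%}")  # accuracy=98.00%, recall=12%
```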

Key Result
Precision and recall are key to measure correct and complete table cell detection; balance them for best extraction quality.