Table extraction from images in Computer Vision - Model Metrics & Evaluation

For table extraction, we want to measure how well the model finds the correct table cells and their content. The key metrics are Precision and Recall. Precision tells us how many detected table cells are actually correct (avoiding false detections); Recall tells us how many real table cells the model found (avoiding missed cells). The F1 score balances the two: high precision means clean, accurate tables, while high recall means complete tables. We also use Intersection over Union (IoU) to check how well the predicted cell boxes overlap with the true boxes.
- True Positives (TP): correctly detected table cells
- False Positives (FP): detected cells that are not real
- False Negatives (FN): real cells missed by the model
Example confusion matrix counts:

+-----------------+--------------------+
|                 |   Predicted Cell   |
|                 |   Yes       No     |
+-----------------+--------------------+
| Actual Cell Yes | TP = 80   FN = 20  |
| Actual Cell No  | FP = 10   TN = 90  |
+-----------------+--------------------+
Total cells = TP + FP + FN + TN = 80 + 10 + 20 + 90 = 200
Precision = TP / (TP + FP) = 80 / (80 + 10) = 0.89
Recall = TP / (TP + FN) = 80 / (80 + 20) = 0.80
F1 Score = 2 * (0.89 * 0.80) / (0.89 + 0.80) ≈ 0.84
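The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a production evaluation harness; the helper name `precision_recall_f1` is ours.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw detection counts.

    Guards against division by zero when a model predicts nothing
    or the ground truth contains no cells.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Counts from the confusion matrix above
p, r, f1 = precision_recall_f1(tp=80, fp=10, fn=20)
print(f"Precision={p:.2f} Recall={r:.2f} F1={f1:.2f}")
# → Precision=0.89 Recall=0.80 F1=0.84
```

Note that TN does not appear in any of the three formulas: for detection tasks, "correctly predicted background" is usually ill-defined and uninformative.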
If the model has high precision but low recall, it finds mostly correct table cells but misses many real ones. This produces incomplete tables, which is a problem when you need the full data.
If the model has high recall but low precision, it finds most real cells but also includes many wrong ones. This produces noisy tables full of spurious entries.
For example, in financial reports, missing a table cell (low recall) loses important data, while including wrong cells (low precision) can corrupt downstream calculations. A balance between the two (high F1) is therefore the goal.
- Good: Precision and Recall above 0.85, F1 score above 0.85, IoU above 0.75 for cell bounding boxes. This means most cells are correctly found and well localized.
- Bad: Precision or Recall below 0.5 means many errors or misses. F1 below 0.6 means poor balance. IoU below 0.5 means boxes do not match well, causing wrong cell boundaries.
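The IoU thresholds above compare axis-aligned bounding boxes. A minimal sketch of the IoU computation, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates:

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)

def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A predicted cell shifted half a cell-width to the right of the truth
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # → 0.333… (below the 0.5 bar)
```

An IoU of 0.5 already allows a visibly shifted box, which is why the thresholds above ask for 0.75 on cell boundaries before calling localization "good".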
- Accuracy paradox: If most images have no tables, a model that always predicts no table can have high accuracy but is useless.
- Data leakage: Using the same documents for training and testing inflates metrics falsely.
- Overfitting: Very high training metrics but low test metrics means the model memorizes tables instead of generalizing.
- Ignoring IoU: Counting a detected cell as correct without checking overlap can overestimate performance.
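To avoid the "ignoring IoU" pitfall, each predicted cell should count as a true positive only if it overlaps an as-yet-unmatched ground-truth cell above the threshold. A greedy one-to-one matching sketch (the names `iou` and `match_cells` are ours, and real evaluators such as the COCO protocol sort predictions by confidence first):

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)

def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def match_cells(pred: list[Box], truth: list[Box], thresh: float = 0.5):
    """Return (TP, FP, FN): each prediction is a TP only if it overlaps
    a still-unmatched ground-truth cell with IoU >= thresh."""
    unmatched = list(truth)
    tp = 0
    for p in pred:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thresh:
            unmatched.remove(best)  # each truth cell can match only once
            tp += 1
    fp = len(pred) - tp   # predictions with no adequate overlap
    fn = len(unmatched)   # truth cells nobody claimed
    return tp, fp, fn
```

Without the IoU check, the second prediction in `match_cells([(1, 0, 10, 10), (100, 100, 110, 110)], [(0, 0, 10, 10), (20, 0, 30, 10)])` would be counted as correct; with it, the result is one TP, one FP, and one FN.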
Your table extraction model has 98% accuracy but only 12% recall on table cells. Is it good for production? Why or why not?
Answer: No, it is not good. The high accuracy is misleading: most images or regions likely contain no tables, so predicting "no table" is trivially correct most of the time (the accuracy paradox above). The 12% recall means the model misses almost nine out of ten real table cells, so it fails to extract usable tables. For production, recall must be far higher so the extracted tables are actually complete.