Model Pipeline - ROUGE evaluation metrics
The ROUGE evaluation metrics measure how well a machine-generated summary matches a human-written summary by comparing overlapping units like words and phrases.
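To make the idea concrete, here is a minimal sketch of ROUGE-1 (unigram overlap), using simple whitespace tokenization; the example summaries are illustrative, and production tools add stemming and further normalization:

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> dict:
    """Compute ROUGE-1 precision, recall, and F1 from unigram overlap."""
    # Naive lowercase + whitespace tokenization (an assumption for this sketch).
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each shared word counts at most min(cand, ref) times.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
# Five of six candidate unigrams match the reference, so all three scores are 5/6.
```

ROUGE-2 follows the same pattern over bigrams, and ROUGE-L uses the longest common subsequence instead of fixed n-grams.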
[Figure: training loss, falling from 0.5 at epoch 1 to 0.2 at epoch 5.]

| Epoch | Loss ↓ | Accuracy ↑ | Observation |
|---|---|---|---|
| 1 | 0.45 | 0.60 | Initial ROUGE scores show moderate overlap between generated and reference summaries. |
| 2 | 0.38 | 0.68 | ROUGE scores improve as the model generates better summaries. |
| 3 | 0.32 | 0.74 | Further improvement in overlap and summary quality. |
| 4 | 0.28 | 0.78 | Model begins to converge, with higher ROUGE scores. |
| 5 | 0.25 | 0.81 | Final epoch yields the best ROUGE evaluation metrics. |