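For abstractive summarization, the standard automatic metrics are the ROUGE scores. ROUGE measures how much the model's summary overlaps with one or more human-written reference summaries. The common variants count shared words (ROUGE-1), shared two-word sequences (ROUGE-2), and the longest common subsequence of words (ROUGE-L); they compare surface text, not meaning or sentence structure. This matters because an abstractive model writes new sentences, so an exact match with the reference is rare and the metric has to give credit for partial overlap. A high ROUGE score suggests the summary keeps the main ideas and important details of the reference, although a correct paraphrase that uses different wording can still score low.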
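To make the overlap idea concrete, here is a minimal sketch of recall-style ROUGE-N (n-gram overlap) and ROUGE-L (longest common subsequence). It assumes lowercased whitespace tokenization and a single reference; the function names and the example sentences are illustrative, not taken from any particular library, and real implementations typically add stemming and report precision and F1 as well.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated word is not credited more often than it appears.
    overlap = sum((cand & ref).values())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the number of reference tokens."""
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    return lcs_length(candidate.lower().split(), ref_tokens) / len(ref_tokens)


# Hypothetical example pair (not from any dataset).
reference = "the cat sat on the mat"
candidate = "a cat was sitting on the mat"

print(rouge_n_recall(candidate, reference, n=1))  # 4 of 6 reference unigrams matched
print(rouge_n_recall(candidate, reference, n=2))  # 2 of 5 reference bigrams matched
print(rouge_l_recall(candidate, reference))       # LCS "cat on the mat" has length 4
```

Note that "sitting" gets no credit for "sat" here: without stemming or semantic matching, the score only sees identical tokens.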
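BLEU is sometimes reported alongside ROUGE, but ROUGE is usually preferred for summarization because it was designed around recall (how much of the reference's content the summary captures), whereas BLEU is built on precision (how much of the summary's content appears in the reference).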
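The difference between the two emphases shows up clearly on a summary that is accurate but incomplete. The sketch below computes unigram precision and recall only; it is a simplification, since full BLEU combines clipped n-gram precisions for several n-gram sizes with a brevity penalty. The function name and the sentences are made up for illustration.

```python
from collections import Counter


def unigram_precision_recall(candidate, reference):
    """Return (precision, recall) over lowercased whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each shared word counts at most as often as it
    # appears in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall


# Hypothetical example: the summary is correct but drops most of the content.
reference = "the storm closed schools and cut power to thousands of homes"
short_summary = "the storm closed schools"

precision, recall = unigram_precision_recall(short_summary, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Every word of the short summary appears in the reference, so precision is perfect, while recall is only 4 out of 11 because the power outage is missing. A recall-oriented score penalizes that omission directly, which is the behavior wanted when judging whether a summary covers the important information.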