Experiment - Long document summarization strategies
Problem: You want to build a model that condenses very long documents into short, clear summaries. The current model is a simple transformer, and it struggles with long inputs.
Current Metrics: Training loss: 0.15, Validation loss: 0.45, Training ROUGE-1: 85%, Validation ROUGE-1: 60%
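For reference, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a minimal illustration, assuming lowercase whitespace tokenization and reporting precision, recall, and F1; the function name `rouge1` is hypothetical, and established ROUGE packages add stemming and other normalization, so their scores may differ from this simplified version.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two texts.

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token,
    # so a repeated word is only credited as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat", "the cat sat on the mat")
print(scores)
```

In this toy example every candidate word appears in the reference, so precision is perfect, but the candidate covers only half of the reference words, so recall is lower.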
Issue: The model overfits the training data. The gap between training and validation metrics (loss 0.15 vs. 0.45, ROUGE-1 85% vs. 60%) shows that it performs poorly on unseen data, because it cannot process long documents effectively.
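One common strategy for long inputs is to split the document into overlapping chunks, summarize each chunk, and join the partial summaries. The sketch below illustrates only that idea and is not the experiment's prescribed solution. The names `split_into_chunks`, `summarize_long`, and `summarize_chunk` are hypothetical, and the window size of 512 tokens and overlap of 64 are illustrative assumptions rather than values from the experiment.

```python
from typing import Callable, List, Sequence


def split_into_chunks(tokens: Sequence, max_len: int = 512, overlap: int = 64) -> List[Sequence]:
    """Split a token sequence into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that sentences cut at a
    boundary still appear with some context in the next window.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("need max_len > 0 and 0 <= overlap < max_len")

    step = max_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


def summarize_long(
    tokens: Sequence,
    summarize_chunk: Callable[[Sequence], str],  # hypothetical per-chunk summarizer
    max_len: int = 512,
    overlap: int = 64,
) -> str:
    """Summarize each chunk independently and join the partial summaries."""
    partial = [summarize_chunk(c) for c in split_into_chunks(tokens, max_len, overlap)]
    return " ".join(partial)


# Usage with a stand-in summarizer that just reports each chunk's length.
document = list(range(1000))
chunks = split_into_chunks(document)
print(len(chunks), [len(c) for c in chunks])
print(summarize_long(document, lambda c: f"[{len(c)} tokens]"))
```

In practice `summarize_chunk` would wrap the model's own generation call, and the joined partial summaries can be passed through the model once more if the result is still too long.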