Choose the option that best describes how abstractive summarization differs from extractive summarization.
Think about whether the summary is created by copying or by generating new text.
Abstractive summarization creates new sentences that may not appear in the original text, capturing the core meaning. Extractive summarization picks sentences directly from the text.
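To make the contrast concrete, here is a minimal illustrative sketch (not part of the question): a naive extractive "summarizer" that copies a sentence verbatim, contrasted in comments with what an abstractive model would do. The `extractive_summary` helper and the sample text are hypothetical.

```python
def extractive_summary(text, n=1):
    """Pick the first n sentences verbatim -- purely extractive."""
    sentences = [s.strip() for s in text.split('.') if s.strip()]
    return '. '.join(sentences[:n]) + '.'

text = ("The cat sat on the mat. It purred loudly. "
        "Later it fell asleep in the sun.")

print(extractive_summary(text))  # copies an original sentence verbatim
# An abstractive system might instead generate something like:
# "A cat relaxed on a mat and napped." -- new wording, same meaning.
```

The extractive output is guaranteed to be a substring of the source; an abstractive summary need not be.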
Given the following Python code using Hugging Face's transformers library, what is the output summary?
from transformers import pipeline
summarizer = pipeline('summarization')
text = ("The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France. "
        "It is named after the engineer Gustave Eiffel, whose company designed and built the tower. "
        "The tower is 324 meters tall and was completed in 1889.")
summary = summarizer(text, max_length=30, min_length=10, do_sample=False)
print(summary[0]['summary_text'])
Look for the most detailed and complete summary that fits the length constraints.
The summarizer produces a concise summary covering the key facts: the tower's height (324 meters), location (Paris), builder (Gustave Eiffel's company), and completion year (1889), matching option A.
In an abstractive summarization model, what is the effect of increasing the max_length parameter during generation?
Think about what controlling the maximum length of output text means for the summary size.
Increasing max_length raises the cap on the number of generated tokens, allowing the model to produce longer summaries that can include more information. Note that it is an upper bound, not a target length: generation may still stop earlier at an end-of-sequence token.
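The role of max_length can be sketched with a toy decoding loop (a hypothetical decoder, not the transformers implementation): generation stops when either an end token appears or the token cap is reached.

```python
def generate(next_token_fn, max_length, eos="<eos>"):
    """Emit tokens until eos or until max_length tokens are produced."""
    tokens = []
    while len(tokens) < max_length:
        tok = next_token_fn(tokens)
        if tok == eos:
            break
        tokens.append(tok)
    return tokens

# A fake "model" that never emits eos, so only max_length stops it:
fake_model = lambda toks: f"w{len(toks)}"

print(len(generate(fake_model, max_length=5)))   # 5
print(len(generate(fake_model, max_length=20)))  # 20 -- larger cap, longer output
```

With a real model the loop usually ends at an end-of-sequence token first, which is why raising max_length only *allows*, rather than forces, longer summaries.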
Choose the metric that best measures how well an abstractive summarization model captures the meaning of the original text.
Think about a metric that compares summaries based on overlapping phrases.
ROUGE is the standard metric for summarization evaluation: it measures the overlap of words and n-grams between the generated summary and human-written reference summaries, serving as a proxy for how much of the original meaning is preserved.
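The core idea behind ROUGE-1 recall can be shown in a few lines (a minimal sketch of unigram overlap only; the real metric, e.g. in the rouge-score package, also covers stemming, ROUGE-2, and ROUGE-L):

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams also present in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())

ref = "the tower is 324 meters tall"
cand = "the tower stands 324 meters tall"
print(round(rouge1_recall(cand, ref), 2))  # 0.83 -- 5 of 6 reference words match
```

Because ROUGE counts surface overlap, an abstractive summary with heavy paraphrasing can score lower than its quality deserves, which is a known limitation of the metric.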
Consider this code snippet using a pretrained summarization model:
from transformers import pipeline
summarizer = pipeline('summarization')
text = "Deep learning models are powerful. Deep learning models are powerful. Deep learning models are powerful."
summary = summarizer(text, max_length=20, min_length=5, do_sample=False)
print(summary[0]['summary_text'])
The output is: "Deep learning models are powerful. Deep learning models are powerful. Deep learning models are powerful." What is the most likely cause?
Consider how generation parameters affect repetition in output.
Setting do_sample=False selects greedy decoding, which always picks the highest-probability next token. On highly repetitive input this deterministic strategy tends to echo the repetition back, a well-known lack-of-diversity failure mode of greedy generation.
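The failure mode can be illustrated with a toy bigram "model" (hypothetical probabilities, not a real transformer): if the argmax transitions form a cycle, greedy decoding loops forever within it.

```python
# Hypothetical bigram next-token probabilities; note the cycle:
# powerful -> deep -> learning -> models -> are -> powerful -> ...
bigram_probs = {
    "deep":     {"learning": 0.9, "water": 0.1},
    "learning": {"models": 0.9, "rate": 0.1},
    "models":   {"are": 0.9, "fail": 0.1},
    "are":      {"powerful": 0.9, "slow": 0.1},
    "powerful": {"deep": 0.9, "<eos>": 0.1},  # loops back to "deep"
}

def greedy_decode(start, steps):
    """Always take the argmax next token -- deterministic, can cycle."""
    out = [start]
    for _ in range(steps):
        probs = bigram_probs[out[-1]]
        nxt = max(probs, key=probs.get)
        if nxt == "<eos>":
            break
        out.append(nxt)
    return " ".join(out)

print(greedy_decode("deep", 9))
# "deep learning models are powerful deep learning models are powerful"
```

In the transformers library, passing do_sample=True or no_repeat_ngram_size to the pipeline breaks such cycles by introducing randomness or by forbidding repeated n-grams.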