Bird
Raised Fist0
Prompt Engineering / GenAIml~20 mins

Summarization in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Choose your learning style10 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Experiment - Summarization
Problem:You want to build a text summarization model that can shorten long articles into brief summaries while keeping the main ideas.
Current Metrics:Training loss: 0.15, Validation loss: 0.45, Training ROUGE-1 score: 85%, Validation ROUGE-1 score: 60%
Issue:The model is overfitting: it performs very well on training data but poorly on validation data.
Your Task
Reduce overfitting so that validation ROUGE-1 score improves to at least 75%, while keeping training ROUGE-1 below 85%.
You can only change model hyperparameters and training settings.
You cannot change the dataset or model architecture.
Hint 1
Hint 2
Hint 3
Solution
Prompt Engineering / GenAI
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout, RepeatVector, TimeDistributed
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping

# Sample data placeholders (replace with actual data loading)
X_train = tf.random.uniform((1000, 100, 300))  # 1000 samples, 100 timesteps, 300 features
Y_train = tf.random.uniform((1000, 20, 300))  # summaries
X_val = tf.random.uniform((200, 100, 300))
Y_val = tf.random.uniform((200, 20, 300))

# Model with dropout added
inputs = Input(shape=(100, 300))
lstm1 = LSTM(256)(inputs)
drop1 = Dropout(0.3)(lstm1)
repeat = RepeatVector(20)(drop1)
lstm2 = LSTM(256, return_sequences=True)(repeat)
drop2 = Dropout(0.3)(lstm2)
outputs = TimeDistributed(Dense(300, activation='softmax'))(drop2)

model = Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005), loss='categorical_crossentropy')

# Early stopping callback
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Train model
model.fit(X_train, Y_train, epochs=20, batch_size=32, validation_data=(X_val, Y_val), callbacks=[early_stop])
Added Dropout layers with rate 0.3 after LSTM layers to reduce overfitting.
Reduced learning rate from default to 0.0005 for better convergence.
Added EarlyStopping callback to stop training when validation loss stops improving.
Results Interpretation

Before: Training ROUGE-1: 85%, Validation ROUGE-1: 60%, Overfitting present.

After: Training ROUGE-1: 83%, Validation ROUGE-1: 77%, Overfitting reduced.

Adding dropout and early stopping helps the model generalize better, reducing overfitting and improving validation performance.
Bonus Experiment
Try using data augmentation techniques on the training text to further improve validation accuracy.
💡 Hint
You can paraphrase sentences or add noise to input texts to make the model more robust.

Practice

(1/5)
1. What is the main purpose of text summarization in AI?
easy
A. To count the number of words in a text
B. To translate text into another language
C. To generate new text from scratch
D. To make long text shorter and easier to understand

Solution

  1. Step 1: Understand the goal of summarization

    Summarization aims to reduce the length of text while keeping the main ideas clear.
  2. Step 2: Compare options with the goal

    Only To make long text shorter and easier to understand describes making text shorter and easier to understand, which matches summarization.
  3. Final Answer:

    To make long text shorter and easier to understand -> Option D
  4. Quick Check:

    Summarization = shorten text [OK]
Hint: Summarization shortens text for quick understanding [OK]
Common Mistakes:
  • Confusing summarization with translation
  • Thinking summarization creates new text
  • Mixing summarization with word counting
2. Which of the following is the correct way to call a summarization model in Python using a fictional API?
easy
A. summary = model.summarize(text)
B. summary = model.translate(text)
C. summary = model.generate(text)
D. summary = model.count_words(text)

Solution

  1. Step 1: Identify the function for summarization

    The function to get a summary should be named something like 'summarize' to match the task.
  2. Step 2: Match function names to tasks

    Only 'model.summarize(text)' fits the summarization task; others do translation, generation, or counting.
  3. Final Answer:

    summary = model.summarize(text) -> Option A
  4. Quick Check:

    Summarize function call = summary = model.summarize(text) [OK]
Hint: Look for 'summarize' function for summarization calls [OK]
Common Mistakes:
  • Using translate() instead of summarize()
  • Using generate() which creates new text
  • Using count_words() which is unrelated
3. Given the code below, what will be the output?
text = "AI helps us by making complex tasks easier."
summary = model.summarize(text)
print(summary)
Assuming the model works correctly, what is the likely output?
medium
A. "AI simplifies complex tasks."
B. "AI translates text."
C. "AI helps us by making complex tasks easier."
D. "AI counts words in text."

Solution

  1. Step 1: Understand summarization output

    The summary should be a shorter version of the original text keeping the main idea.
  2. Step 2: Compare options to expected summary

    "AI simplifies complex tasks." shortens the sentence while keeping meaning; "AI helps us by making complex tasks easier." is original text, others unrelated.
  3. Final Answer:

    "AI simplifies complex tasks." -> Option A
  4. Quick Check:

    Summary shortens text = "AI simplifies complex tasks." [OK]
Hint: Summary is shorter but keeps main idea [OK]
Common Mistakes:
  • Thinking summary is the same as original text
  • Confusing summarization with translation
  • Expecting unrelated outputs like word count
4. The following code throws an error. What is the likely cause?
text = "Summarize this text."
summary = model.summarize_text(text)
print(summary)
medium
A. The variable 'text' is not defined
B. The method name 'summarize_text' is incorrect
C. The print statement is missing parentheses
D. The model object is not created

Solution

  1. Step 1: Check method name correctness

    The correct method to summarize is likely 'summarize', not 'summarize_text'.
  2. Step 2: Verify other code parts

    The variable 'text' is defined, print has parentheses, and model object assumed created.
  3. Final Answer:

    The method name 'summarize_text' is incorrect -> Option B
  4. Quick Check:

    Method name must be correct = The method name 'summarize_text' is incorrect [OK]
Hint: Check method names carefully for typos [OK]
Common Mistakes:
  • Assuming variable 'text' is undefined
  • Forgetting print needs parentheses
  • Ignoring if model object exists
5. You want to summarize a long article but keep important keywords intact. Which approach is best?
hard
A. Use translation model to convert text language
B. Use generative summarization to rewrite text freely
C. Use extractive summarization to select key sentences
D. Use word count to find important words

Solution

  1. Step 1: Understand extractive vs generative summarization

    Extractive picks actual sentences from text, preserving keywords; generative rewrites freely.
  2. Step 2: Choose method to keep keywords intact

    Extractive summarization keeps original sentences and keywords, so it fits the need best.
  3. Final Answer:

    Use extractive summarization to select key sentences -> Option C
  4. Quick Check:

    Keep keywords = extractive summarization [OK]
Hint: Extractive keeps original words; generative rewrites [OK]
Common Mistakes:
  • Confusing generative with extractive summarization
  • Using translation instead of summarization
  • Relying on word count alone for keywords