Prompt Engineering / GenAIml~20 mins

Summarization in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Choose your learning style10 modes available

Learn Why Deep Model Try Challenge Experiment Recall Metrics

Start learning this pattern below

Jump into concepts and practice - no test required

Recommended

Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong

Experiment - Summarization

Problem:You want to build a text summarization model that can shorten long articles into brief summaries while keeping the main ideas.

Current Metrics:Training loss: 0.15, Validation loss: 0.45, Training ROUGE-1 score: 85%, Validation ROUGE-1 score: 60%

Issue:The model is overfitting: it performs very well on training data but poorly on validation data.

Your Task

Reduce overfitting so that validation ROUGE-1 score improves to at least 75%, while keeping training ROUGE-1 below 85%.

You can only change model hyperparameters and training settings.

You cannot change the dataset or model architecture.

Hint 1

Hint 2

Hint 3

Solution

Prompt Engineering / GenAI

import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout, RepeatVector, TimeDistributed
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping

# Sample data placeholders (replace with actual data loading)
X_train = tf.random.uniform((1000, 100, 300))  # 1000 samples, 100 timesteps, 300 features
Y_train = tf.random.uniform((1000, 20, 300))  # summaries
X_val = tf.random.uniform((200, 100, 300))
Y_val = tf.random.uniform((200, 20, 300))

# Model with dropout added
inputs = Input(shape=(100, 300))
lstm1 = LSTM(256)(inputs)
drop1 = Dropout(0.3)(lstm1)
repeat = RepeatVector(20)(drop1)
lstm2 = LSTM(256, return_sequences=True)(repeat)
drop2 = Dropout(0.3)(lstm2)
outputs = TimeDistributed(Dense(300, activation='softmax'))(drop2)

model = Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005), loss='categorical_crossentropy')

# Early stopping callback
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Train model
model.fit(X_train, Y_train, epochs=20, batch_size=32, validation_data=(X_val, Y_val), callbacks=[early_stop])

Added Dropout layers with rate 0.3 after LSTM layers to reduce overfitting.

Reduced learning rate from default to 0.0005 for better convergence.

Added EarlyStopping callback to stop training when validation loss stops improving.

Results Interpretation

Before: Training ROUGE-1: 85%, Validation ROUGE-1: 60%, Overfitting present.

After: Training ROUGE-1: 83%, Validation ROUGE-1: 77%, Overfitting reduced.

Adding dropout and early stopping helps the model generalize better, reducing overfitting and improving validation performance.

Bonus Experiment

Try using data augmentation techniques on the training text to further improve validation accuracy.

💡 Hint

You can paraphrase sentences or add noise to input texts to make the model more robust.

Practice

(1/5)

1. What is the main purpose of text summarization in AI?

easy

A. To count the number of words in a text

B. To translate text into another language

C. To generate new text from scratch

D. To make long text shorter and easier to understand

Summarization in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Start learning this pattern below

Practice

Solution

Step 1: Understand the goal of summarization

Step 2: Compare options with the goal

Final Answer:

Quick Check:

Solution

Step 1: Identify the function for summarization

Step 2: Match function names to tasks

Final Answer:

Quick Check:

Solution

Step 1: Understand summarization output

Step 2: Compare options to expected summary

Final Answer:

Quick Check:

Solution

Step 1: Check method name correctness

Step 2: Verify other code parts

Final Answer:

Quick Check:

Solution

Step 1: Understand extractive vs generative summarization

Step 2: Choose method to keep keywords intact

Final Answer:

Quick Check: