Question answering in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Question answering
Problem: Build a question answering model that reads a paragraph and answers questions about it.
Current Metrics: Training accuracy: 98%, Validation accuracy: 70%
Issue: The model is overfitting: it performs very well on training data but poorly on validation data.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85%, while keeping training accuracy below 92%.
You can only change model architecture and training hyperparameters.
Do not change the dataset or data preprocessing steps.
Solution
import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping

# Sample data placeholders (replace with actual data loading).
# The labels here are random, so this data will not reproduce the reported metrics.
X_train = tf.random.uniform((1000, 100), maxval=20000, dtype=tf.int32)
y_train = tf.random.uniform((1000, 1), maxval=2, dtype=tf.int32)
X_val = tf.random.uniform((200, 100), maxval=20000, dtype=tf.int32)
y_val = tf.random.uniform((200, 1), maxval=2, dtype=tf.int32)

vocab_size = 20000
embedding_dim = 64
max_len = 100

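inputs = Input(shape=(max_len,))
embedding = Embedding(vocab_size, embedding_dim)(inputs)
# Smaller LSTM (64 units) limits model capacity
lstm = LSTM(64, return_sequences=False)(embedding)
# Dropout zeroes each LSTM output with probability 0.5 during training
drop = Dropout(0.5)(lstm)
outputs = Dense(1, activation='sigmoid')(drop)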

model = Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Stop once val_loss has not improved for 3 epochs, then restore the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=20, batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop])
Added a Dropout layer with rate 0.5 after the LSTM layer to reduce overfitting.
Reduced LSTM units from 128 to 64 to simplify the model.
Set the learning rate to 0.001, which is also the Keras default for Adam, for smoother training.
Added EarlyStopping callback to stop training when validation loss stops improving.
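The patience rule behind the EarlyStopping callback can be sketched in plain Python. This is a simplified illustration, not the Keras implementation: the function name `early_stopping_epoch` and the validation losses are made up for the example, and `patience=3` mirrors the value used in the solution.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) as 0-based indices.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Never ran out of patience: trained for every epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, one per epoch
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.7, 0.8]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print("stopped at epoch", stop_epoch, "best epoch", best_epoch)
```

With `restore_best_weights=True`, the model would end up with the weights from `best_epoch` rather than from the epoch where training stopped.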
Results Interpretation

Before: Training accuracy was 98%, validation accuracy was 70%, showing strong overfitting.

After: Training accuracy dropped to 90%, validation accuracy improved to 87%, indicating better generalization.

Adding dropout, reducing model size, lowering the learning rate, and using early stopping help reduce overfitting and improve validation accuracy.
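As a quick check of the numbers above against the task, the gap between training and validation accuracy can be computed directly. The helper name `meets_target` is an assumption made for this sketch; the thresholds (validation at least 85%, training below 92%) come from the task statement.

```python
def meets_target(train_acc, val_acc, min_val=85, max_train=92):
    """Accuracies are percentages; thresholds come from the task statement."""
    return val_acc >= min_val and train_acc < max_train


before = {"train": 98, "val": 70}
after = {"train": 90, "val": 87}

for name, m in (("before", before), ("after", after)):
    gap = m["train"] - m["val"]  # a large gap signals overfitting
    print(name, "gap:", gap, "target met:", meets_target(m["train"], m["val"]))
```

The gap shrinks from 28 points to 3 points, and only the "after" metrics satisfy both conditions.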
Bonus Experiment
Try using a pretrained language model like BERT for question answering and fine-tune it on the dataset.
💡 Hint
Use the Hugging Face transformers library to load a pretrained BERT model and fine-tune it with a smaller learning rate.
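A minimal setup sketch for this bonus experiment might look like the following. Several things here are assumptions rather than part of the exercise: the checkpoint name `bert-base-uncased`, the learning rate `3e-5`, the `max_length` of 384, the helper names, and the use of PyTorch instead of TensorFlow. The training loop itself is left out.

```python
# Hypothetical settings; tune them for your dataset.
MODEL_NAME = "bert-base-uncased"
LEARNING_RATE = 3e-5  # much smaller than the 1e-3 used for the LSTM model


def build_model_and_optimizer(model_name=MODEL_NAME, lr=LEARNING_RATE):
    """Load a pretrained BERT with a question answering head and an optimizer."""
    # Imported lazily so the settings above can be inspected without
    # torch/transformers installed; calling this downloads the checkpoint.
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    return tokenizer, model, optimizer


def encode_example(tokenizer, question, context, max_length=384):
    """Tokenize a question/context pair, truncating only the context."""
    return tokenizer(
        question,
        context,
        truncation="only_second",
        max_length=max_length,
        padding="max_length",
        return_tensors="pt",
    )
```

The small learning rate is the key design choice: the pretrained weights already encode useful language knowledge, and large update steps could overwrite it.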

Practice

1. What is the main purpose of question answering in AI?
easy
A. To find answers from given text or context
B. To generate random text without context
C. To translate languages automatically
D. To create images from descriptions

Solution

  1. Step 1: Understand the goal of question answering

    Question answering systems are designed to find specific answers from a given text or context.
  2. Step 2: Compare options with the goal

    Only option A, "To find answers from given text or context", matches this purpose.
  3. Final Answer:

    To find answers from given text or context -> Option A
  4. Quick Check:

    Question answering = find answers [OK]
Hint: Focus on the idea of finding an answer in text [OK]
Common Mistakes:
  • Confusing question answering with translation
  • Thinking it generates random text
  • Mixing it with image generation
2. Which input is essential for a question answering model to work?
easy
A. Only a context without a question
B. Only a question without any context
C. A question and a related context or passage
D. Random text unrelated to the question

Solution

  1. Step 1: Identify inputs needed for question answering

    Question answering requires both a question and some context to find the answer.
  2. Step 2: Match options with required inputs

    Only option C, "A question and a related context or passage", provides both the question and the related context that the model needs.
  3. Final Answer:

    A question and a related context or passage -> Option C
  4. Quick Check:

    Question + context = answer [OK]
Hint: Remember that a question needs context to be answered [OK]
Common Mistakes:
  • Assuming question alone is enough
  • Ignoring the need for context
  • Choosing unrelated text as input
3. Given this Python code using a question answering model:
from transformers import pipeline
qa = pipeline('question-answering')
context = "The Eiffel Tower is in Paris."
question = "Where is the Eiffel Tower located?"
result = qa(question=question, context=context)
print(result['answer'])
What will be printed?
medium
A. Location unknown
B. Eiffel Tower
C. The Eiffel Tower is in Paris
D. Paris

Solution

  1. Step 1: Understand the code's purpose

    The code uses a question answering pipeline to find the answer to the question from the context.
  2. Step 2: Identify the answer in the context

    The question asks for location; the context says "The Eiffel Tower is in Paris." So the answer is "Paris".
  3. Final Answer:

    Paris -> Option D
  4. Quick Check:

    Answer extracted = Paris [OK]
Hint: Look for the span in the context that directly answers the question [OK]
Common Mistakes:
  • Printing the whole context instead of the answer
  • Confusing the subject (the Eiffel Tower) with its location
  • Assuming no answer found
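To make the extraction in question 3 concrete: besides 'answer', the pipeline's result dictionary also carries 'score', 'start', and 'end', where the two offsets locate the answer inside the context. The sketch below builds a stand-in result by hand, without running any model, and omits the score.

```python
context = "The Eiffel Tower is in Paris."
answer = "Paris"

# Hand-built stand-in for the pipeline output (no model is run here)
start = context.find(answer)
result = {"answer": answer, "start": start, "end": start + len(answer)}

# The answer is a slice of the context, not newly generated text
extracted = context[result["start"]:result["end"]]
print(extracted)
```

This is why the model prints "Paris" rather than the whole sentence: an extractive model selects a span from the context.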
4. This code snippet tries to answer a question but raises an error:
from transformers import pipeline
qa = pipeline('question-answering')
context = "Python is a programming language."
question = "What is Python?"
result = qa(question, context)
print(result['answer'])
What is the error, and how do you fix it?
medium
A. Error: question is invalid; fix by changing question text
B. Error: missing keyword arguments; fix by using qa(question=question, context=context)
C. Error: context is empty; fix by adding text to context
D. No error; code runs fine

Solution

  1. Step 1: Identify the function call error

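    The pipeline's documented call form uses the keyword arguments question= and context=, but the code passes both strings positionally. Positional handling can differ between transformers versions, so keyword arguments are the reliable form.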
  2. Step 2: Fix the call with correct keywords

    Change to qa(question=question, context=context) to fix the error.
  3. Final Answer:

    Error: missing keyword arguments; fix by using qa(question=question, context=context) -> Option B
  4. Quick Check:

    Use keywords for qa() args [OK]
Hint: Use keyword arguments for question and context [OK]
Common Mistakes:
  • Passing positional args instead of keywords
  • Assuming empty context causes error
  • Changing question text unnecessarily
5. You want to build a question answering system that can handle multiple paragraphs and find the best answer. Which approach is best?
hard
A. Split text into paragraphs, run QA on each, then pick highest confidence answer
B. Combine all paragraphs into one string and run QA once
C. Only use the first paragraph for QA
D. Ignore paragraphs and guess answer randomly

Solution

  1. Step 1: Understand handling multiple paragraphs

    Extractive QA models have a limited input length (512 tokens for BERT), so they work best on smaller text chunks, and splitting helps.
  2. Step 2: Choose method to find best answer

    Running QA on each paragraph separately and selecting the answer with the highest confidence score gives the best chance of finding the right answer.
  3. Final Answer:

    Split text into paragraphs, run QA on each, then pick highest confidence answer -> Option A
  4. Quick Check:

    Split + score answers = best result [OK]
Hint: Split text, run QA per part, pick best answer [OK]
Common Mistakes:
  • Running QA on all the text at once, which can confuse the model
  • Ignoring some paragraphs, which reduces accuracy
  • Guessing answers without context
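The approach from question 5 can be sketched as follows. The function `best_answer` and the stand-in model `toy_qa` are hypothetical names for illustration; `toy_qa` only scores word overlap so the example runs without downloading a model. A real Hugging Face question answering pipeline could be passed in its place, since it also accepts question= and context= and returns a dictionary with 'answer' and 'score'.

```python
def best_answer(qa, question, text):
    """Run `qa` on each paragraph and return the highest-scoring result."""
    # Paragraphs are assumed to be separated by blank lines
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    results = [qa(question=question, context=p) for p in paragraphs]
    if not results:
        return None
    return max(results, key=lambda r: r["score"])


def toy_qa(question, context):
    """Toy stand-in for a QA model: scores a paragraph by word overlap."""
    q_words = {w.strip(".,?").lower() for w in question.split()}
    c_words = {w.strip(".,?").lower() for w in context.split()}
    score = len(q_words & c_words) / max(len(q_words), 1)
    # A real model would return a short span; the toy returns the paragraph
    return {"answer": context, "score": score}


text = "Python is a programming language.\n\nThe Eiffel Tower is in Paris."
best = best_answer(toy_qa, "Where is the Eiffel Tower?", text)
print(best["answer"], best["score"])
```

Scores from different paragraphs are compared directly here, which is a simplification; confidence values are not always comparable across contexts.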