Extractive QA concept in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What is the main goal of extractive question answering?
In extractive question answering, what is the system primarily designed to do?
A. Generate a completely new answer not present in the text
B. Classify the question into predefined categories
C. Summarize the entire passage into a short paragraph
D. Select a span of text from the given passage as the answer
💡 Hint
Think about whether the answer is created or taken directly from the text.
Predict Output
intermediate
Output of extractive QA model span prediction
Given a passage and a question, an extractive QA model outputs the start and end indices of the answer span. If the passage is "The sky is blue and clear today." and the model predicts start index 3 and end index 5 (0-based word positions, end inclusive, as in the code below), what is the extracted answer?
passage = "The sky is blue and clear today."
start_idx = 3
end_idx = 5
words = passage.split()
answer = ' '.join(words[start_idx:end_idx+1])
print(answer)
A. "and clear today."
B. "is blue and"
C. "blue and clear"
D. "sky is blue"
💡 Hint
Check the words between indices 3 and 5 inclusive.
Model Choice
advanced
Best model architecture for extractive QA
Which model architecture is most suitable for extractive question answering tasks?
A. Pretrained Transformer model fine-tuned to predict start and end token positions
B. Sequence-to-sequence model trained for translation tasks
C. Generative Transformer model like GPT that generates answers from scratch
D. Convolutional Neural Network for image classification
💡 Hint
Extractive QA requires locating answer spans, not generating new text.
Metrics
advanced
Which metric best evaluates extractive QA performance?
For extractive question answering, which metric is commonly used to measure how well the predicted answer matches the true answer span?
A. BLEU score measuring n-gram overlap
B. Exact Match (EM) score checking exact span match
C. Mean Squared Error between predicted and true spans
D. Accuracy of classifying question types
💡 Hint
Think about a metric that checks if the predicted answer exactly matches the true answer.
🔧 Debug
expert
Why does this extractive QA model fail to find the answer span?
A supposedly fine-tuned extractive QA model always predicts 0 for both the start and end index, regardless of input. What is the most likely cause?
def predict_span(model, input_ids):
    start_scores, end_scores = model(input_ids)
    start_index = start_scores.argmax()
    end_index = end_scores.argmax()
    return start_index, end_index

# Model always returns start_index=0 and end_index=0
A. The model was not fine-tuned and outputs default scores favoring index 0
B. The input_ids are empty, so the model cannot predict spans
C. The argmax function is used incorrectly, causing wrong indices
D. The model architecture is incompatible with extractive QA
💡 Hint
Consider what happens if the model is not trained on the task.

Practice

1. What is the main goal of extractive question answering (QA)?
easy
A. To translate the question into another language
B. To generate a new answer not present in the text
C. To summarize the entire text into a short paragraph
D. To find the exact answer span within a given text

Solution

  1. Step 1: Understand extractive QA purpose

    Extractive QA aims to locate the exact part of the text that answers the question.
  2. Step 2: Compare options with definition

    Only option D, "To find the exact answer span within a given text", matches this definition. The other options describe translation, generation, and summarization.
  3. Final Answer:

    To find the exact answer span within a given text -> Option D
  4. Quick Check:

    Extractive QA = find exact answer span [OK]
Hint: Extractive QA selects text from the passage; it does not create new answers [OK]
Common Mistakes:
  • Confusing extractive QA with generative QA
  • Thinking extractive QA summarizes text
  • Assuming extractive QA translates questions
2. Which of the following is the correct way to represent an extractive QA model's output?
easy
A. Span of text indices indicating the answer start and end
B. Single integer representing the answer length
C. List of unrelated keywords from the text
D. Boolean value indicating if the answer exists

Solution

  1. Step 1: Recall extractive QA output format

    Extractive QA models output the start and end positions of the answer span in the text.
  2. Step 2: Match options to output format

    Only option A, "Span of text indices indicating the answer start and end", describes the output as a pair of positions marking where the answer begins and ends.
  3. Final Answer:

    Span of text indices indicating the answer start and end -> Option A
  4. Quick Check:

    Output = start and end indices [OK]
Hint: Extractive QA outputs answer span positions, not just length [OK]
Common Mistakes:
  • Choosing answer length instead of span indices
  • Confusing keywords with answer span
  • Thinking output is just true/false
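To make the start/end output format concrete, here is a minimal sketch of turning per-token start and end scores into one valid span. The function name `best_span`, the `max_answer_len` parameter, and the score values are all hypothetical, chosen for illustration; real models produce these scores as logits over tokens.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return (start, end, score) maximizing start + end score with start <= end.

    Hypothetical helper: scores are plain lists of floats, one per token.
    """
    best = None
    for s, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start position.
        for e in range(s, min(s + max_answer_len, len(end_scores))):
            score = s_score + end_scores[e]
            if best is None or score > best[2]:
                best = (s, e, score)
    return best


# Made-up scores for a 5-token passage.
start_scores = [0.0, 0.25, 2.5, 0.5, 0.0]
end_scores = [0.0, 3.0, 0.25, 2.0, 0.5]

start, end, score = best_span(start_scores, end_scores)
print(start, end)  # 2 3
```

Taking the argmax of each list independently would give start 2 and end 1 here, which is not a valid span. Searching over pairs with start <= end avoids that.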
3. Given the context: 'The Eiffel Tower is located in Paris.' and the question: 'Where is the Eiffel Tower?', what would an extractive QA model most likely output?
medium
A. "Eiffel Tower"
B. "located"
C. "Paris"
D. "The Eiffel Tower is located"

Solution

  1. Step 1: Understand question and context

    The question asks for the location of the Eiffel Tower, which is stated as "Paris" in the context.
  2. Step 2: Identify exact answer span

    The extractive QA model selects the exact text span answering the question, which is "Paris".
  3. Final Answer:

    "Paris" -> Option C
  4. Quick Check:

    Answer = "Paris" [OK]
Hint: Extractive QA picks exact answer phrase from context [OK]
Common Mistakes:
  • Selecting part of the question as answer
  • Choosing unrelated words from context
  • Picking longer phrases than needed
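The span for this example can also be written as character offsets into the context. The sketch below assumes the answer text is known and appears verbatim in the context; `char_span` is a hypothetical helper, not part of any library.

```python
def char_span(context, answer_text):
    """Return (start_char, end_char) of the first occurrence of answer_text, or None.

    end_char is exclusive, so context[start_char:end_char] == answer_text.
    """
    start_char = context.find(answer_text)
    if start_char == -1:
        return None
    return start_char, start_char + len(answer_text)


context = "The Eiffel Tower is located in Paris."
span = char_span(context, "Paris")
print(span)                        # (31, 36)
print(context[span[0]:span[1]])    # Paris
```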
4. Consider this extractive QA model output code snippet:
start_idx = 10
end_idx = 5
answer = context[start_idx:end_idx]
What is the main issue here?
medium
A. The end index is smaller than the start index, causing an empty answer
B. The indices are correct and will extract the answer properly
C. The code is missing a question input
D. The context variable is undefined

Solution

  1. Step 1: Analyze index values

    The start index is 10 and the end index is 5, which is smaller than start.
  2. Step 2: Understand slicing behavior

    In Python, slicing with start > end returns an empty string, so no answer is extracted.
  3. Final Answer:

    The end index is smaller than the start index, causing an empty answer -> Option A
  4. Quick Check:

    End index < start index = empty slice [OK]
Hint: With Python's exclusive-end slicing, the end index must be greater than the start index to get a non-empty slice [OK]
Common Mistakes:
  • Ignoring index order in slicing
  • Assuming code runs without error
  • Overlooking empty string result
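The slicing behavior described in this solution is easy to check directly. The sketch below also shows one possible guard; `extract_answer` is a hypothetical helper that treats `end_idx` as exclusive, matching the snippet in the question.

```python
def extract_answer(context, start_idx, end_idx):
    """Return context[start_idx:end_idx], or None if the span is empty or out of range."""
    if not (0 <= start_idx < end_idx <= len(context)):
        return None
    return context[start_idx:end_idx]


context = "The Eiffel Tower is located in Paris."

print(repr(context[10:5]))              # '' (no error, just an empty string)
print(extract_answer(context, 10, 5))   # None
print(extract_answer(context, 31, 36))  # Paris
```

Returning None rather than an empty string makes the invalid span explicit to the caller instead of letting it pass as an answer.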
5. You want to improve an extractive QA model to handle questions where the answer might not be present in the context. Which approach is best?
hard
A. Use a generative model instead of extractive QA
B. Add a 'no answer' prediction option so the model can say answer is missing
C. Train the model only on questions with guaranteed answers
D. Force the model to always select some text span regardless

Solution

  1. Step 1: Understand the problem of missing answers

    Extractive QA models can fail if forced to select an answer when none exists in context.
  2. Step 2: Evaluate solution options

    Adding a 'no answer' option lets the model explicitly indicate no answer is found, improving reliability.
  3. Final Answer:

    Add a 'no answer' prediction option so the model can say answer is missing -> Option B
  4. Quick Check:

    Handle missing answers = add 'no answer' option [OK]
Hint: Allow the model to say 'no answer' when the context does not contain one [OK]
Common Mistakes:
  • Forcing answer selection even if none exists
  • Ignoring questions without answers during training
  • Switching to generative models unnecessarily
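One common way to add a 'no answer' option is to reserve a special position (assumed here to be index 0, for example a [CLS] token) and compare its score with the best real span. The sketch below follows that convention; the function name, the `threshold` parameter, and the score values are assumptions for illustration, and in practice the threshold is tuned on held-out data.

```python
def predict_with_no_answer(start_scores, end_scores, threshold=0.0):
    """Return (start, end) of the best span, or None to signal 'no answer'.

    Assumption: index 0 is a special token whose scores represent 'no answer'.
    """
    null_score = start_scores[0] + end_scores[0]

    # Best span over real tokens only (positions 1 and up), with start <= end.
    best = None
    for s in range(1, len(start_scores)):
        for e in range(s, len(end_scores)):
            score = start_scores[s] + end_scores[e]
            if best is None or score > best[2]:
                best = (s, e, score)

    # Abstain when the null score beats the best span by more than the threshold.
    if best is None or null_score - best[2] > threshold:
        return None
    return best[0], best[1]


# Made-up scores: the first case favors 'no answer', the second a real span.
print(predict_with_no_answer([5.0, 0.0, 1.0], [5.0, 0.0, 1.0]))            # None
print(predict_with_no_answer([0.0, 1.0, 4.0, 0.0], [0.0, 0.0, 1.0, 4.0]))  # (2, 3)
```

For this to work, the training data must include unanswerable questions, so that the model learns to put high scores on the special position when the context lacks an answer.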