Extractive Question Answering (QA) finds the exact answer to a question within a given text. It selects the part of the text that answers the question instead of writing a new answer.
Introduction
Syntax
    model = ExtractiveQAModel()
    answer = model.answer(question, context)
The question is what you want to know.
The context is the text where the answer is found.
Examples
    question = "Where is the Eiffel Tower located?"
    context = "The Eiffel Tower is in Paris, France."
    answer = model.answer(question, context)

    question = "Who wrote 'Pride and Prejudice'?"
    context = "Jane Austen wrote 'Pride and Prejudice' in 1813."
    answer = model.answer(question, context)
Sample Model
This code uses a ready-made model to find the answer in the context. It prints the answer and how confident the model is.
    from transformers import pipeline

    # Load a pre-trained extractive QA pipeline
    qa_pipeline = pipeline('question-answering')

    context = "The Statue of Liberty is located in New York Harbor. It was a gift from France."
    question = "Where is the Statue of Liberty located?"

    result = qa_pipeline(question=question, context=context)

    print(f"Answer: {result['answer']}")
    print(f"Score: {result['score']:.2f}")
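Besides `answer` and `score`, the result dictionary of the question-answering pipeline also carries `start` and `end` character offsets into the context. The sketch below does not run a model: it builds a dictionary of the same shape by hand (so the values are illustrative, not real model output) and shows that slicing the context with those offsets gives back the answer. The helper name `answer_from_offsets` is hypothetical, and the end offset is assumed to be exclusive, as in Python slicing.

```python
# Hand-built stand-in for a question-answering result; not real model output.
context = "The Statue of Liberty is located in New York Harbor. It was a gift from France."
answer_text = "New York Harbor"

start = context.index(answer_text)  # character offset where the span begins
end = start + len(answer_text)      # exclusive end offset (assumption)

result = {"answer": answer_text, "start": start, "end": end}


def answer_from_offsets(result: dict, context: str) -> str:
    """Recover the answer by slicing the context with the span offsets."""
    return context[result["start"]:result["end"]]


recovered = answer_from_offsets(result, context)
print(recovered == result["answer"])  # the span is copied verbatim from the context
```

This is what makes the approach extractive: the answer is always a substring of the context, so it can be located and highlighted in the original text.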
Important Notes
Extractive QA only picks answers from the given text; it does not generate new information.
A clear, relevant context that actually contains the answer helps the model find better answers.
Models may give a confidence score showing how sure they are about the answer.
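The notes above can be made concrete with a small sketch of the selection step. Many extractive QA models score every token as a possible answer start and as a possible answer end, then pick the best pair. The code below is a simplified illustration under stated assumptions, not the implementation of any particular library: the function names, the `max_len` limit, the tokens, and the scores are all made up, and using the product of start and end probabilities as the confidence is one common choice among several.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_scores, end_scores, max_len=5):
    """Return (start, end, confidence) for the highest-probability span.

    end is inclusive. Only spans with start <= end and at most
    max_len tokens are considered.
    """
    start_probs = softmax(start_scores)
    end_probs = softmax(end_scores)
    best = (0, 0, 0.0)
    for i, p_start in enumerate(start_probs):
        for j in range(i, min(i + max_len, len(end_probs))):
            prob = p_start * end_probs[j]
            if prob > best[2]:
                best = (i, j, prob)
    return best


# Toy tokens and made-up scores, chosen only to illustrate the selection step.
tokens = ["The", "Eiffel", "Tower", "is", "in", "Paris", ",", "France", "."]
start_scores = [0.1, 0.2, 0.1, 0.0, 0.3, 4.0, 0.0, 1.5, 0.0]
end_scores = [0.0, 0.1, 0.3, 0.0, 0.1, 3.0, 0.2, 2.5, 0.1]

start, end, confidence = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])  # the answer is built only from context tokens
print(answer, round(confidence, 2))
```

With these toy scores the selected span is the single token "Paris". The confidence is a probability between 0 and 1, which is the kind of value the score in the sample model reports.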
Summary
Extractive QA finds exact answers inside a text.
It works by selecting a part of the context that answers the question.
It is useful for quick fact-finding from documents or chats.
Practice
1. What is the main goal of extractive question answering (QA)?
easy
Solution
Step 1: Understand extractive QA purpose
Extractive QA aims to locate the exact part of the text that answers the question.
Step 2: Compare options with definition
Only the option "To find the exact answer span within a given text" describes locating the answer span inside the text, which matches extractive QA.
Final Answer:
To find the exact answer span within a given text -> Option D
Quick Check:
Extractive QA = find exact answer span [OK]
Hint: Extractive QA picks text parts, not creates new answers [OK]
Common Mistakes:
- Confusing extractive QA with generative QA
- Thinking extractive QA summarizes text
- Assuming extractive QA translates questions
2. Which of the following is the correct way to represent an extractive QA model's output?
easy
Solution
Step 1: Recall extractive QA output format
Extractive QA models output the start and end positions of the answer span in the text.
Step 2: Match options to output format
Only the option "Span of text indices indicating the answer start and end" correctly describes the output as a pair of start and end indices.
Final Answer:
Span of text indices indicating the answer start and end -> Option A
Quick Check:
Output = start and end indices [OK]
Hint: Extractive QA outputs answer span positions, not just length [OK]
Common Mistakes:
- Choosing answer length instead of span indices
- Confusing keywords with answer span
- Thinking output is just true/false
3. Given the context: 'The Eiffel Tower is located in Paris.' and the question: 'Where is the Eiffel Tower?', what would an extractive QA model most likely output?
medium
Solution
Step 1: Understand question and context
The question asks for the location of the Eiffel Tower, which is stated as "Paris" in the context.
Step 2: Identify exact answer span
The extractive QA model selects the exact text span answering the question, which is "Paris".
Final Answer:
"Paris" -> Option C
Quick Check:
Answer = "Paris" [OK]
Hint: Extractive QA picks exact answer phrase from context [OK]
Common Mistakes:
- Selecting part of the question as answer
- Choosing unrelated words from context
- Picking longer phrases than needed
4. Consider this extractive QA model output code snippet:
    start_idx = 10
    end_idx = 5
    answer = context[start_idx:end_idx]
What is the main issue here?
medium
Solution
Step 1: Analyze index values
The start index is 10 and the end index is 5, so the end index is smaller than the start index.
Step 2: Understand slicing behavior
In Python, slicing with start > end returns an empty string without raising an error, so no answer is extracted.
Final Answer:
The end index is smaller than the start index, causing an empty answer -> Option A
Quick Check:
End index < start index = empty slice [OK]
Hint: In a Python slice the end index is exclusive, so it must be greater than the start index to return any text [OK]
Common Mistakes:
- Ignoring index order in slicing
- Assuming code runs without error
- Overlooking empty string result
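Because a reversed slice fails silently, a small guard can make the problem visible. The sketch below is one possible way to do that, assuming character indices with an exclusive end as in Python slicing; the helper name `extract_span` is hypothetical.

```python
from typing import Optional


def extract_span(context: str, start_idx: int, end_idx: int) -> Optional[str]:
    """Return context[start_idx:end_idx], or None if the span is invalid.

    end_idx is exclusive, matching Python slicing.
    """
    if not (0 <= start_idx < end_idx <= len(context)):
        return None  # reversed, empty, or out-of-range span
    return context[start_idx:end_idx]


context = "The Eiffel Tower is located in Paris."
print(repr(context[10:5]))            # '' : the reversed slice fails silently
print(extract_span(context, 10, 5))   # None: the guard makes the problem visible
print(extract_span(context, 31, 36))  # Paris
```

Returning `None` instead of an empty string lets the calling code tell "no valid span" apart from a real answer.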
5. You want to improve an extractive QA model to handle questions where the answer might not be present in the context. Which approach is best?
hard
Solution
Step 1: Understand the problem of missing answers
Extractive QA models can fail if forced to select an answer when none exists in context.
Step 2: Evaluate solution options
Adding a 'no answer' option lets the model explicitly indicate no answer is found, improving reliability.
Final Answer:
Add a 'no answer' prediction option so the model can say answer is missing -> Option B
Quick Check:
Handle missing answers = add 'no answer' option [OK]
Hint: Allow model to say 'no answer' when context lacks answer [OK]
Common Mistakes:
- Forcing answer selection even if none exists
- Ignoring questions without answers during training
- Switching to generative models unnecessarily
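One common way to add a 'no answer' option is to reserve a special position (here, index 0) whose start and end scores together stand for "the answer is not in the context", and to abstain when that score beats the best real span by more than a threshold. The sketch below illustrates the idea only: the function name, the `threshold` and `max_len` parameters, and the scores are assumptions for illustration, not the behavior of a specific model or library.

```python
from typing import List, Optional, Tuple


def predict_with_no_answer(
    start_scores: List[float],
    end_scores: List[float],
    threshold: float = 0.0,
    max_len: int = 5,
) -> Optional[Tuple[int, int]]:
    """Return the best (start, end) token span, or None for 'no answer'.

    Position 0 is assumed to be a special token whose combined score
    stands for 'the context does not contain the answer'.
    """
    null_score = start_scores[0] + end_scores[0]
    best_score = float("-inf")
    best_span = None
    for i in range(1, len(start_scores)):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best_score, best_span = score, (i, j)
    # Abstain when 'no answer' beats the best span by more than the threshold.
    if best_span is None or null_score - best_score > threshold:
        return None
    return best_span


# Made-up scores for illustration only.
answerable = predict_with_no_answer([0.5, 0.1, 3.0, 0.2], [0.4, 0.0, 0.3, 2.8])
unanswerable = predict_with_no_answer([4.0, 0.1, 0.3, 0.2], [3.5, 0.0, 0.3, 0.1])
print(answerable)    # (2, 3)
print(unanswerable)  # None
```

Raising the threshold makes the model answer more often, and lowering it makes the model abstain more often, so the value is usually tuned on held-out questions that include unanswerable examples.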
