What if you could find any answer hidden in a book in just a second?
Why Use Extractive QA in NLP? - Purpose & Use Cases
Imagine you have a huge book and someone asks you a specific question about its content. You try to find the exact answer by reading page after page, line after line, hoping to spot the right sentence.
This manual search is slow and tiring. You might miss the answer or pick the wrong part. It's easy to get lost in too much text and waste time flipping pages.
Extractive Question Answering (QA) uses smart models to quickly scan the text and pick out the exact words or sentences that answer the question. It's like having a super-fast helper who knows where to look.
# Manual approach: scan the document line by line for the question's keywords
answer = None
for line in document:
    if question_keywords in line:
        answer = line
        break
answer = model.extract_answer(question, document)
It lets us get precise answers from large texts instantly, making information easy to find and use.
Customer support bots that read product manuals and instantly answer user questions without making them wait or search themselves.
Manual searching is slow and error-prone.
Extractive QA finds exact answers quickly from text.
This makes accessing information fast and reliable.
Practice
Solution
Step 1: Understand extractive QA purpose
Extractive QA aims to locate the exact part of the text that answers the question.
Step 2: Compare options with definition
Only "To find the exact answer span within a given text" describes finding the exact answer span inside the text, which matches extractive QA.
Final Answer:
To find the exact answer span within a given text -> Option D
Quick Check:
Extractive QA = find exact answer span [OK]
- Confusing extractive QA with generative QA
- Thinking extractive QA summarizes text
- Assuming extractive QA translates questions
Solution
Step 1: Recall extractive QA output format
Extractive QA models output the start and end positions of the answer span in the text.
Step 2: Match options to output format
Only "Span of text indices indicating the answer start and end" correctly describes the output as text span indices.
Final Answer:
Span of text indices indicating the answer start and end -> Option A
Quick Check:
Output = start and end indices [OK]
- Choosing answer length instead of span indices
- Confusing keywords with answer span
- Thinking output is just true/false
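The start and end output described above can be sketched in a few lines. This is a minimal illustration, not a real model: the tokens and the per-token scores are made-up values, and `best_span` is a hypothetical helper that picks the pair of positions with the highest combined score, subject to the start not coming after the end.

```python
def best_span(start_scores, end_scores, max_len=10):
    """Return (start, end) maximizing start_scores[s] + end_scores[e], with s <= e."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider ends at or after the start, within max_len tokens.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Illustrative values only; a trained model would produce these scores.
tokens = ["The", "library", "opens", "at", "nine", "."]
start_scores = [0.1, 0.2, 0.1, 0.3, 2.5, 0.0]
end_scores = [0.0, 0.1, 0.2, 0.1, 2.8, 0.4]

start, end = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])  # end index is inclusive here
print(start, end, answer)
```

The answer is read directly from the context using the two indices, which is why the output is a span and not a label or a generated sentence.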
Given the context 'The Eiffel Tower is located in Paris.' and the question 'Where is the Eiffel Tower?', what would an extractive QA model most likely output?
Solution
Step 1: Understand question and context
The question asks for the location of the Eiffel Tower, which is stated as "Paris" in the context.
Step 2: Identify exact answer span
The extractive QA model selects the exact text span answering the question, which is "Paris".
Final Answer:
"Paris" -> Option C
Quick Check:
Answer = "Paris" [OK]
- Selecting part of the question as answer
- Choosing unrelated words from context
- Picking longer phrases than needed
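To make the idea of an answer span concrete, the sketch below locates "Paris" inside the Eiffel Tower context by plain string search. This is only an illustration of what the indices mean: a real extractive QA model predicts the positions, and the known answer is used here as an assumption so the example stays self-contained.

```python
context = "The Eiffel Tower is located in Paris."
answer_text = "Paris"  # assumed known here; a model would predict the span

# Character offsets of the answer inside the context (end is exclusive).
start_idx = context.find(answer_text)
end_idx = start_idx + len(answer_text)

# Slicing the context with the span gives back the exact answer text.
extracted = context[start_idx:end_idx]
print(start_idx, end_idx, extracted)
```

Because the answer is a slice of the context, it can never contain words that are absent from the source text.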
start_idx = 10
end_idx = 5
answer = context[start_idx:end_idx]
What is the main issue here?
Solution
Step 1: Analyze index values
The start index is 10 and the end index is 5, which is smaller than the start index.
Step 2: Understand slicing behavior
In Python, slicing with start > end returns an empty string, so no answer is extracted.
Final Answer:
The end index is smaller than the start index, causing an empty answer -> Option A
Quick Check:
End index < start index = empty slice [OK]
- Ignoring index order in slicing
- Assuming code runs without error
- Overlooking empty string result
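The silent empty slice discussed above can be caught with a small guard. The sketch below is one possible way to do it: `extract_span` is a hypothetical helper, and the context string is an illustrative assumption since the exercise does not give one.

```python
def extract_span(context, start_idx, end_idx):
    """Return context[start_idx:end_idx], or None if the span is invalid or empty."""
    if not (0 <= start_idx < end_idx <= len(context)):
        return None
    return context[start_idx:end_idx]


# Illustrative context; the exercise itself does not specify one.
context = "The Eiffel Tower is located in Paris."

raw = context[10:5]                    # no error, just an empty string
guarded = extract_span(context, 10, 5) # invalid span is reported as None
valid = extract_span(context, 31, 36)  # a well-formed span

print(repr(raw), guarded, valid)
```

Returning `None` makes the failure visible to the caller, whereas an empty string can easily be mistaken for a legitimate answer.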
Solution
Step 1: Understand the problem of missing answers
Extractive QA models can fail if forced to select an answer when none exists in context.
Step 2: Evaluate solution options
Adding a 'no answer' option lets the model explicitly indicate no answer is found, improving reliability.
Final Answer:
Add a 'no answer' prediction option so the model can say answer is missing -> Option B
Quick Check:
Handle missing answers = add 'no answer' option [OK]
- Forcing answer selection even if none exists
- Ignoring questions without answers during training
- Switching to generative models unnecessarily
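One common way to give a model a 'no answer' option is to compare the best span score against a separate "null" score and abstain when the null score wins by more than a threshold. The sketch below assumes that setup; `predict_with_no_answer`, the scores, and the threshold are all hypothetical values chosen for illustration, not the output of any real model.

```python
def predict_with_no_answer(tokens, start_scores, end_scores, null_score, threshold=0.0):
    """Return the best answer span as text, or None if 'no answer' scores higher."""
    best_score = float("-inf")
    best = None
    for s in range(len(tokens)):
        for e in range(s, len(tokens)):  # end must not precede start
            score = start_scores[s] + end_scores[e]
            if score > best_score:
                best_score, best = score, (s, e)
    # Abstain when the null score beats the best span by more than the threshold.
    if best is None or null_score - best_score > threshold:
        return None
    s, e = best
    return " ".join(tokens[s:e + 1])


# Illustrative values only.
tokens = ["The", "library", "opens", "at", "nine", "."]
start_scores = [0.1, 0.2, 0.1, 0.3, 2.5, 0.0]
end_scores = [0.0, 0.1, 0.2, 0.1, 2.8, 0.4]

answerable = predict_with_no_answer(tokens, start_scores, end_scores, null_score=1.0)
unanswerable = predict_with_no_answer(tokens, start_scores, end_scores, null_score=9.0)
print(answerable, unanswerable)
```

Raising the threshold makes the sketch answer more often, and lowering it makes it abstain more often, so the value would normally be tuned on held-out questions.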
