
Extractive QA concept in NLP - Model Pipeline Trace

Model Pipeline - Extractive QA concept

This pipeline answers a question by selecting an exact text span from a given passage. It reads the question and the passage, then highlights the answer inside the passage.

Data Flow - 5 Stages
Stage 1: Input Data
  Input:   1000 samples x 2 texts (question, passage)
  Step:    Receive pairs of question and passage texts
  Output:  1000 samples x 2 texts (question, passage)
  Example: Question: 'Where is the Eiffel Tower located?'; Passage: 'The Eiffel Tower is in Paris, France, and is a famous landmark.'

Stage 2: Tokenization
  Input:   1000 samples x 2 texts
  Step:    Split texts into tokens (words or subwords)
  Output:  1000 samples x 2 token lists (question tokens, passage tokens)
  Example: Question tokens: ['Where', 'is', 'the', 'Eiffel', 'Tower', 'located', '?']; Passage tokens: ['The', 'Eiffel', 'Tower', 'is', 'in', 'Paris', ',', 'France', ',', 'and', 'is', 'a', 'famous', 'landmark', '.']

Stage 3: Input Encoding
  Input:   1000 samples x 2 token lists
  Step:    Convert tokens to numerical vectors using embeddings
  Output:  1000 samples x 2 sequences of vectors (e.g., 768-dim)
  Example: Question vectors: [[0.1, 0.3, ...], ...]; Passage vectors: [[0.2, 0.4, ...], ...]

Stage 4: Model Forward Pass
  Input:   1000 samples x 2 sequences of vectors
  Step:    Use a neural network (e.g., BERT) to predict the start and end positions of the answer in the passage
  Output:  1000 samples x 2 probability distributions over passage tokens (start and end)
  Example: Start probs: [0.01, 0.02, 0.7, 0.1, ...]; End probs: [0.01, 0.02, 0.1, 0.6, ...]

Stage 5: Answer Extraction
  Input:   1000 samples x 2 probability distributions
  Step:    Select the token span with the highest combined start and end probabilities
  Output:  1000 samples x 1 text span (answer)
  Example: Answer: 'Paris, France'
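The five stages above can be sketched end to end in plain Python. Everything here is illustrative: the forward pass is a hard-coded stub standing in for a fine-tuned transformer (e.g. BERT), and the probabilities mirror the example values in the trace.

```python
import re

def tokenize(text):
    # Stage 2: naive word/punctuation split (real systems use subword tokenizers)
    return re.findall(r"\w+|[^\w\s]", text)

def encode(tokens):
    # Stage 3: stand-in for an embedding lookup; a real encoder maps each
    # token to a ~768-dim vector, here a 1-dim placeholder suffices
    return [[float(len(t))] for t in tokens]

def model_forward(question_vecs, passage_vecs, passage_tokens):
    # Stage 4: a real model outputs start/end probability distributions
    # over passage tokens; this stub simply peaks on 'Paris' ... 'France'
    n = len(passage_tokens)
    start = [0.01] * n
    end = [0.01] * n
    start[passage_tokens.index("Paris")] = 0.70
    end[passage_tokens.index("France")] = 0.60
    return start, end

def extract_span(start, end, tokens, max_len=10):
    # Stage 5: pick the span (i, j) with i <= j maximizing start[i] * end[j]
    best_i, best_j, best_score = 0, 0, -1.0
    for i in range(len(tokens)):
        for j in range(i, min(i + max_len, len(tokens))):
            score = start[i] * end[j]
            if score > best_score:
                best_i, best_j, best_score = i, j, score
    return tokens[best_i:best_j + 1]

def detokenize(tokens):
    # Rejoin tokens, attaching punctuation to the preceding word
    out = ""
    for t in tokens:
        out += t if (not out or t in ",.!?;:") else " " + t
    return out

question = "Where is the Eiffel Tower located?"
passage = "The Eiffel Tower is in Paris, France, and is a famous landmark."

p_tokens = tokenize(passage)
start_probs, end_probs = model_forward(encode(tokenize(question)), encode(p_tokens), p_tokens)
answer = detokenize(extract_span(start_probs, end_probs, p_tokens))
print(answer)  # Paris, France
```

The span search deliberately enforces start <= end; taking independent argmaxes of the two distributions can produce an invalid (reversed) span.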
Training Trace - Epoch by Epoch

Loss per epoch (bar length proportional to loss):

Epoch 1  1.20 |************
Epoch 2  0.80 |********
Epoch 3  0.50 |*****
Epoch 4  0.35 |****
Epoch 5  0.30 |***
Epoch  Loss ↓  Accuracy ↑  Observation
1      1.20    0.45        Model starts learning; loss is high, accuracy low
2      0.80    0.60        Loss decreases; accuracy improves as the model learns to locate answers
3      0.50    0.75        Model shows good understanding; loss continues to drop
4      0.35    0.82        Training converges; accuracy stabilizes near 82%
5      0.30    0.85        Final epoch; model achieves strong performance
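The loss tracked above is typically the standard extractive-QA objective: the average of two cross-entropies, one on the gold start position and one on the gold end position. A minimal sketch, where the logit values are made up purely to contrast an untrained model with a trained one:

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def span_loss(start_logits, end_logits, true_start, true_end):
    # Average of the cross-entropies on the gold start and end positions,
    # the usual training objective for extractive-QA heads.
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    return -(math.log(p_start[true_start]) + math.log(p_end[true_end])) / 2

# Gold answer span is tokens (2, 3). Flat logits mimic an untrained model;
# peaked logits mimic a model that has learned to locate the span.
flat = [0.0] * 5
peaked_start = [0.0, 0.0, 4.0, 0.0, 0.0]
peaked_end = [0.0, 0.0, 0.0, 4.0, 0.0]

print(span_loss(flat, flat, 2, 3))                # high, like early epochs
print(span_loss(peaked_start, peaked_end, 2, 3))  # low, like late epochs
```

As training sharpens the start/end distributions around the gold positions, this loss falls, which is exactly the trend in the epoch table.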
Prediction Trace - 5 Layers
Layer 1: Tokenization
Layer 2: Input Encoding
Layer 3: Model Forward Pass
Layer 4: Answer Extraction
Layer 5: Detokenization
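Layer 5 exists because subword tokenizers split rare words into pieces; a WordPiece-style tokenizer might produce ['Ei', '##ffel'], for example. A minimal merger for that convention (a real tokenizer's decode step also handles casing and punctuation spacing):

```python
def merge_wordpieces(pieces):
    # WordPiece convention: a piece starting with "##" continues
    # the previous token, so it is attached without a space.
    words = []
    for p in pieces:
        if p.startswith("##") and words:
            words[-1] += p[2:]
        else:
            words.append(p)
    return " ".join(words)

print(merge_wordpieces(["Ei", "##ffel", "Tower"]))  # Eiffel Tower
```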
Model Quiz - 3 Questions
Test your understanding
What does the model predict to find the answer in the passage?
A. Only the first word of the passage
B. Start and end positions of the answer span
C. The entire passage as the answer
D. A summary of the passage
Key Insight
Extractive QA models learn to locate exact answer spans by predicting start and end positions in the passage. As training progresses, the model improves by reducing loss and increasing accuracy, enabling precise answer extraction from text.