Answer span extraction in NLP - ML Experiment: Train & Evaluate

Experiment - Answer span extraction

Problem: We want to build a model that finds the exact answer span in a paragraph given a question. Currently, the model predicts the start and end positions of the answer in the text.
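To make the start/end formulation concrete, here is a minimal sketch in plain Python of how per-token start and end scores can be decoded into an answer span. The function name `best_span`, the `max_len` cap, and the toy scores are illustrative assumptions, not part of the experiment. Rather than picking the start and end independently, this variant scores each (start, end) pair jointly and only considers spans whose end is not before the start.

```python
def best_span(start_scores, end_scores, max_len=10):
    """Return (start, end) maximizing start_scores[s] + end_scores[e]
    subject to s <= e < s + max_len. Both indices are inclusive."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


tokens = ["The", "Eiffel", "Tower", "is", "in", "Paris", "."]
# Hypothetical scores for the question "Where is the Eiffel Tower?"
start_scores = [0.1, 0.2, 0.0, 0.1, 0.3, 2.5, 0.0]
end_scores = [0.0, 0.1, 0.4, 0.0, 0.1, 2.8, 0.2]

start, end = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])  # end index is inclusive
print(start, end, answer)  # 5 5 Paris
```

The nested loop is quadratic in sequence length in the worst case, which the `max_len` cap keeps manageable for paragraph-sized inputs.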

Current Metrics: Training loss: 0.15, Training accuracy (exact match): 85%, Validation loss: 0.40, Validation accuracy (exact match): 65%

Issue: The model is overfitting: training accuracy (85%) is 20 points higher than validation accuracy (65%), and validation loss (0.40) is well above training loss (0.15).
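As a rough illustration of the metrics quoted above, the sketch below computes exact-match accuracy over (start, end) pairs and the train/validation gap. The helper names `exact_match` and `generalization_gap` and the toy spans are assumptions made for this example; only the 85% and 65% figures come from the experiment.

```python
def exact_match(pred_spans, gold_spans):
    """Percentage of examples whose predicted (start, end) equals the gold span."""
    if not gold_spans:
        return 0.0
    hits = sum(1 for p, g in zip(pred_spans, gold_spans) if p == g)
    return 100.0 * hits / len(gold_spans)


def generalization_gap(train_acc, val_acc):
    """Training accuracy minus validation accuracy, in percentage points."""
    return train_acc - val_acc


# Toy predictions: 3 of 4 spans match on both ends.
preds = [(2, 4), (0, 1), (5, 5), (3, 7)]
golds = [(2, 4), (0, 1), (5, 6), (3, 7)]
print(exact_match(preds, golds))       # 75.0

# Accuracies reported for the current model.
print(generalization_gap(85.0, 65.0))  # 20.0
```

Note that a span counts as correct only when both the start and the end match, so a prediction that is off by one token on either side scores zero.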

Your Task

Reduce overfitting so that validation accuracy improves to at least 75%, while keeping training accuracy below 90%.

You cannot change the dataset or add more data.

You must keep the same model architecture (a simple BiLSTM with start/end classifiers).

Solution

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

class AnswerSpanModel(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, dropout_rate=0.3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.bilstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout_rate)
        self.start_classifier = nn.Linear(hidden_dim * 2, 1)
        self.end_classifier = nn.Linear(hidden_dim * 2, 1)

    def forward(self, x):
        emb = self.embedding(x)
        lstm_out, _ = self.bilstm(emb)
        dropped = self.dropout(lstm_out)
        start_logits = self.start_classifier(dropped).squeeze(-1)
        end_logits = self.end_classifier(dropped).squeeze(-1)
        return start_logits, end_logits

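# Assume train_loader and val_loader are defined elsewhere and yield batches of
# (inputs, start_positions, end_positions): inputs is a LongTensor of token ids
# with shape (batch, seq_len); each positions tensor is a LongTensor of shape (batch,).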

model = AnswerSpanModel(vocab_size=10000, embedding_dim=100, hidden_dim=64, dropout_rate=0.3)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

best_val_acc = 0
patience = 3
trigger_times = 0

for epoch in range(20):
    model.train()
    for inputs, start_positions, end_positions in train_loader:
        optimizer.zero_grad()
        start_logits, end_logits = model(inputs)
        loss_start = criterion(start_logits, start_positions)
        loss_end = criterion(end_logits, end_positions)
        loss = loss_start + loss_end
        loss.backward()
        optimizer.step()

    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for inputs, start_positions, end_positions in val_loader:
            start_logits, end_logits = model(inputs)
            pred_start = start_logits.argmax(dim=1)
            pred_end = end_logits.argmax(dim=1)
            correct += ((pred_start == start_positions) & (pred_end == end_positions)).sum().item()
            total += inputs.size(0)
    val_acc = correct / total * 100
    print(f"Epoch {epoch+1}, Validation Exact Match Accuracy: {val_acc:.2f}%")

    if val_acc > best_val_acc:
        best_val_acc = val_acc
        trigger_times = 0
    else:
        trigger_times += 1
        if trigger_times >= patience:
            print("Early stopping triggered")
            break

Added a dropout layer with rate 0.3 after the BiLSTM to reduce overfitting.

Lowered the learning rate from 0.01 to 0.001 for better convergence.

Implemented early stopping with a patience of 3 epochs to avoid overtraining.
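The training loop stops after `patience` epochs without improvement, but it ends with the weights from the last epoch rather than the best one. The sketch below is one possible way to also remember which epoch was best; the class name `EarlyStopper` and the validation history are made up for illustration. In PyTorch, a common companion step (not shown here) is to keep a copy of `model.state_dict()` at the best epoch and restore it with `model.load_state_dict(...)` after stopping.

```python
class EarlyStopper:
    """Tracks the best validation score and signals when to stop."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_score = float("-inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, score, epoch):
        """Record one epoch. Returns True when training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.best_epoch = epoch
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation accuracies, one per epoch.
val_history = [60.0, 70.0, 77.0, 76.5, 76.0, 75.0, 74.0]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, val_acc in enumerate(val_history, start=1):
    if stopper.step(val_acc, epoch):
        stopped_at = epoch
        break

print(f"Stopped at epoch {stopped_at}; best was epoch {stopper.best_epoch}")
# Stopped at epoch 6; best was epoch 3
```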

Results Interpretation

Before: Training accuracy 85%, Validation accuracy 65% (overfitting)

After: Training accuracy 88%, Validation accuracy 77% (reduced overfitting)

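Adding dropout and early stopping helps the model generalize better: the gap between training and validation accuracy shrinks from 20 points (85% vs. 65%) to 11 points (88% vs. 77%), and both targets are met (validation at least 75%, training below 90%).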
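One detail behind these results is that dropout behaves differently in training and evaluation, which is why the loop calls `model.train()` before the training batches and `model.eval()` before validation. A small sketch, assuming PyTorch is installed and using a standalone `nn.Dropout` layer rather than the full model:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dropout = nn.Dropout(p=0.3)
x = torch.ones(1000)

# Training mode: each element is zeroed with probability p, and the
# survivors are scaled by 1 / (1 - p) so the expected value is unchanged.
dropout.train()
train_out = dropout(x)
kept = train_out[train_out != 0]
zero_fraction = (train_out == 0).float().mean().item()

# Evaluation mode: dropout is a no-op.
dropout.eval()
eval_out = dropout(x)

print(f"fraction zeroed in train mode: {zero_fraction:.2f}")
print(f"value of surviving entries: {kept[0].item():.4f}")
print(f"identity in eval mode: {torch.equal(eval_out, x)}")
```

Forgetting `model.eval()` during validation would leave dropout active and make the reported validation accuracy noisier and typically lower than the model's real performance.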

Bonus Experiment

Try using a pretrained language model like BERT for answer span extraction to improve accuracy.

💡 Hint

Use Hugging Face transformers library and fine-tune a BERT model on the same dataset.
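A minimal sketch of one fine-tuning step with `BertForQuestionAnswering` from the `transformers` library is shown below. To keep it runnable offline, it builds a tiny, randomly initialized model from a `BertConfig` and feeds it a fake batch; both are assumptions for illustration only. For the actual bonus experiment you would load pretrained weights (for example with `BertForQuestionAnswering.from_pretrained("bert-base-uncased")`) and tokenize each question and paragraph pair with the matching tokenizer.

```python
import torch
from transformers import BertConfig, BertForQuestionAnswering

# Tiny random model so the sketch runs without downloading weights (assumption).
config = BertConfig(
    vocab_size=1000,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)
model = BertForQuestionAnswering(config)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Fake batch: 4 sequences of 16 token ids with gold start/end positions.
input_ids = torch.randint(0, config.vocab_size, (4, 16))
attention_mask = torch.ones_like(input_ids)
start_positions = torch.tensor([1, 3, 5, 7])
end_positions = torch.tensor([2, 4, 6, 9])

model.train()
outputs = model(
    input_ids=input_ids,
    attention_mask=attention_mask,
    start_positions=start_positions,
    end_positions=end_positions,
)
# When positions are passed, the model returns the average of the
# start and end cross-entropy losses.
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()

print(outputs.start_logits.shape, outputs.end_logits.shape)
```

The output has the same shape as the BiLSTM model's: one start logit and one end logit per token, so the exact-match evaluation loop can be reused largely unchanged.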

Practice

1. What is the main goal of answer span extraction in NLP?

easy

A. To generate new text based on a prompt

B. To find the exact part of text that answers a question

C. To summarize long documents into short sentences

D. To translate text from one language to another

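Solution

Step 1: Understand the purpose of answer span extraction. The task is to locate the exact part of a passage that answers a given question.

Step 2: Compare with other NLP tasks. Generating text from a prompt, summarizing documents, and translating between languages all produce new text rather than selecting an existing span.

Final Answer: B. To find the exact part of text that answers a question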
