Open-domain Question Answering (QA) enables computers to answer arbitrary questions by searching a large collection of texts, such as Wikipedia. It makes information easy to access, much like asking a smart assistant.
Open-domain QA basics in NLP
Introduction
You want to build a chatbot that answers general questions from Wikipedia.
You need a system that finds quick answers from a big document collection.
You want to help users get facts without reading long articles.
You want to create a voice assistant that answers any question.
You want to test how well a model understands language and facts.
Syntax
NLP
1. Input: A question in natural language.
2. Retrieve: Find relevant documents or passages from a large text collection.
3. Reader: Use a model to read the retrieved text and find the exact answer.
4. Output: Return the answer text.
The process usually has two parts: retrieval and reading.
Models like BERT or GPT can be used as readers to find answers.
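The retrieve-then-read steps above can be sketched in plain Python. This is a minimal illustration, not a real system: the tiny corpus is made up, retrieve ranks by simple word overlap instead of a real retriever like BM25 or a dense encoder, and read is a stub that stands in for a trained reader model such as BERT.

```python
import re

# Toy corpus standing in for a large document collection (assumed data)
corpus = [
    "Albert Einstein was a physicist who developed the theory of relativity.",
    "George Orwell wrote the novel 1984, published in 1949.",
    "Paris is the capital and largest city of France.",
]

def retrieve(question, docs, k=1):
    """Rank documents by word overlap with the question
    (a stand-in for a real retriever such as BM25)."""
    q_words = set(re.findall(r"\w+", question.lower()))
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(re.findall(r"\w+", d.lower()))),
        reverse=True,
    )
    return scored[:k]

def read(question, passage):
    """Stub reader: return the first multi-word capitalized phrase.
    A real system would use a trained model (e.g. BERT) here."""
    match = re.search(r"[A-Z][a-z]+(?: [A-Z][a-z]+)+", passage)
    return match.group(0) if match else None

question = "Who wrote the book 1984?"
passage = retrieve(question, corpus)[0]
print(read(question, passage))  # prints: George Orwell
```

The key design point is the separation of concerns: the retriever narrows millions of documents down to a few passages cheaply, so the expensive reader model only runs on a handful of candidates.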
Examples
This example shows how the system finds the author of a book by searching and reading.
NLP
Question: "Who wrote the book '1984'?"
Retrieve: Find the Wikipedia page about '1984'.
Reader: Extract 'George Orwell' as the answer.
Output: "George Orwell"
Simple fact question answered by retrieving and reading relevant text.
NLP
Question: "What is the capital of France?"
Retrieve: Find documents about France.
Reader: Extract 'Paris' as the answer.
Output: "Paris"
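The reading step in these examples can be mimicked with a toy heuristic: pick a capitalized word from the passage that does not already appear in the question. This is only a sketch of what an extractive reader does; a trained model would instead score every candidate answer span. The function name extract_answer and the passage text are assumptions for illustration.

```python
import re

def extract_answer(question, passage):
    """Toy extractive reader: return the first capitalized word in the
    passage that is not already in the question. A trained reader model
    would score every possible answer span instead."""
    q_words = {w.lower() for w in re.findall(r"\w+", question)}
    for word in re.findall(r"[A-Z][a-z]+", passage):
        if word.lower() not in q_words:
            return word
    return None

print(extract_answer(
    "What is the capital of France?",
    "Paris is the capital and largest city of France.",
))  # prints: Paris
```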
Sample Model
This code uses a ready-made model to answer a question from a given text. It prints the answer and confidence score.
NLP
from transformers import pipeline

# Load a question-answering pipeline (downloads a default pretrained model)
qa = pipeline('question-answering')

# Define the question and context
question = "Who developed the theory of relativity?"
context = (
    "Albert Einstein was a physicist who developed the theory of relativity, "
    "one of the two pillars of modern physics."
)

# Get the answer and its confidence score
result = qa(question=question, context=context)
print(f"Answer: {result['answer']}")
print(f"Score: {result['score']:.2f}")
Important Notes
Open-domain QA needs a large text collection to find answers from.
Retrieval quality limits answer quality: the reader cannot extract an answer from a passage that was never retrieved.
Pretrained language models help readers understand and extract answers well.
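Because retrieval sets an upper bound on answer quality, retrievers are often evaluated with recall@k: the fraction of questions whose answer-bearing passage appears in the top-k results. A minimal sketch, using made-up ranked results for three questions:

```python
def recall_at_k(retrieved, gold, k):
    """Fraction of questions whose gold passage index
    appears in the top-k retrieved indices."""
    hits = sum(1 for ranks, g in zip(retrieved, gold) if g in ranks[:k])
    return hits / len(gold)

# Hypothetical ranked passage indices for three questions
retrieved = [[2, 0, 1], [1, 2, 0], [0, 1, 2]]
# Index of the passage that actually contains each answer
gold = [2, 0, 0]

print(recall_at_k(retrieved, gold, 1))  # 2 of 3 gold passages ranked first
```

If recall@k is low, improving the reader cannot help, since the correct passage never reaches it.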
Summary
Open-domain QA finds answers to any question from large texts.
It works by retrieving relevant texts and reading them to find answers.
Pretrained models like BERT make reading and answering easier.