LangChain framework · ~5 mins

Source citation in RAG responses in LangChain

Introduction
Source citation shows users where the information in an AI answer comes from, making responses trustworthy and easy to verify. Common situations where it helps:
When you want your AI to give answers with proof from documents.
When users need to verify facts from original sources.
When building chatbots that answer questions using company manuals or articles.
When you want to avoid AI making up information.
When you want to show users exactly which document or page the answer is based on.
Syntax
LangChain
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

vectorstore = FAISS.load_local("my_faiss_index", OpenAIEmbeddings())
retriever = vectorstore.as_retriever()  # load_local returns a vector store, not a retriever
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

result = qa({"query": "What is the capital of France?"})
print(result['result'])
print(result['source_documents'])
Set return_source_documents=True to receive the retrieved source documents alongside each answer.
Each source document carries metadata, such as the page number or document name, that you can surface as a citation.
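The quality of those citations depends on the metadata attached when documents are indexed. In LangChain, each Document pairs page_content with a metadata dict; the sketch below uses a minimal stand-in class to illustrate that shape (the file names and metadata keys are assumptions for illustration — with real LangChain objects you would pass such a list to FAISS.from_documents):

```python
from dataclasses import dataclass, field

# Minimal stand-in mirroring langchain.schema.Document's
# page_content/metadata shape, for illustration only.
@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)

# Attach citation-friendly metadata at indexing time.
docs = [
    Document(
        page_content="Paris is the capital of France.",
        metadata={"source": "geography_manual.pdf", "page": 12},
    ),
    Document(
        page_content="Photosynthesis converts light into chemical energy.",
        metadata={"source": "biology_notes.txt", "page": 3},
    ),
]

# With real LangChain objects, the index would then be built with:
# vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
for doc in docs:
    print(doc.metadata["source"], "page", doc.metadata["page"])
```

Whatever keys you store here ("source", "page", and so on) are exactly what comes back in result['source_documents'], so choose names you can format into a readable citation later.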
Examples
This example shows how to get the answer and metadata about the sources.
LangChain
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

result = qa({"query": "Explain photosynthesis."})
print(result['result'])
print([doc.metadata for doc in result['source_documents']])
You can also iterate over the returned source documents to print each one's full content along with its metadata.
LangChain
result = qa({'query': "Who wrote Hamlet?"})
print(result['result'])
for doc in result['source_documents']:
    print(doc.page_content)
    print(doc.metadata)
Sample Program
This program loads a FAISS index with embeddings, creates a retrieval QA chain that returns source documents, asks a question, and prints the answer along with metadata and a snippet from each source document.
LangChain
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

# Load embeddings and FAISS index
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.load_local("faiss_index", embeddings)
retriever = vectorstore.as_retriever()  # convert the vector store into a retriever

# Create QA chain with source documents returned
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

# Ask a question
query = "What is the tallest mountain in the world?"
result = qa({'query': query})

# Print answer
print("Answer:", result['result'])

# Print source citations
print("Sources:")
for i, doc in enumerate(result['source_documents'], 1):
    print(f"Source {i} metadata:", doc.metadata)
    print(f"Source {i} content snippet:", doc.page_content[:100], "...")
Important Notes
Make sure your documents have metadata like source name or page number for clear citations.
Returning source documents may slow down response time slightly but improves trust.
You can customize how source info is shown by processing the metadata and content.
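The last note above can be sketched as a small helper that turns retrieved source documents into numbered, user-facing citation lines. The Document stand-in below mirrors LangChain's page_content/metadata shape, and the metadata keys "source" and "page" are assumptions about how your index was built:

```python
from dataclasses import dataclass, field

# Stand-in mirroring langchain.schema.Document's shape.
@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_citations(source_documents, snippet_len=80):
    """Turn retrieved documents into numbered citation lines."""
    lines = []
    for i, doc in enumerate(source_documents, 1):
        source = doc.metadata.get("source", "unknown source")
        page = doc.metadata.get("page")
        location = f"{source}, p. {page}" if page is not None else source
        snippet = doc.page_content[:snippet_len]
        lines.append(f'[{i}] {location}: "{snippet}..."')
    return "\n".join(lines)

docs = [
    Document("Mount Everest is the tallest mountain above sea level.",
             {"source": "atlas.pdf", "page": 42}),
]
print(format_citations(docs))
```

In a real chain you would call format_citations(result['source_documents']) after running the query, and adjust the fields and layout to match whatever metadata your documents carry.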
Summary
Source citation shows where AI answers come from, making them trustworthy.
Set return_source_documents=True in LangChain's RetrievalQA to get sources along with the answer.
Print metadata and content snippets to share clear citations with users.