LangChain framework · ~5 mins

Source citation in RAG responses in LangChain

Introduction
Source citation shows users where the information in an AI answer comes from, making responses trustworthy and easy to verify. Common situations where it helps:
When you want your AI to give answers with proof from documents.
When users need to verify facts from original sources.
When building chatbots that answer questions using company manuals or articles.
When you want to avoid AI making up information.
When you want to show users exactly which document or page the answer is based on.
Syntax
LangChain
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

vectorstore = FAISS.load_local("my_faiss_index", OpenAIEmbeddings())
retriever = vectorstore.as_retriever()  # load_local returns a vector store, not a retriever
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

result = qa({"query": "What is the capital of France?"})
print(result['result'])
print(result['source_documents'])
Set return_source_documents=True to receive the retrieved source documents alongside each answer.
Each source document carries metadata, such as the page number or document name, that you can surface as a citation.
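The quality of those citations depends on the metadata attached when documents are indexed. In LangChain, each Document pairs page_content with a metadata dict; the sketch below uses a minimal stand-in class to illustrate that shape (the file names and metadata keys are assumptions for illustration — with real LangChain objects you would pass such a list to FAISS.from_documents):

```python
from dataclasses import dataclass, field

# Minimal stand-in mirroring langchain.schema.Document's
# page_content/metadata shape, for illustration only.
@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)

# Attach citation-friendly metadata at indexing time.
docs = [
    Document(
        page_content="Paris is the capital of France.",
        metadata={"source": "geography_manual.pdf", "page": 12},
    ),
    Document(
        page_content="Photosynthesis converts light into chemical energy.",
        metadata={"source": "biology_notes.txt", "page": 3},
    ),
]

# With real LangChain objects, the index would then be built with:
# vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
for doc in docs:
    print(doc.metadata["source"], "page", doc.metadata["page"])
```

Whatever keys you store here ("source", "page", and so on) are exactly what comes back in result['source_documents'], so choose names you can format into a readable citation later.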
Examples
This example shows how to get the answer and metadata about the sources.
LangChain
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

result = qa({"query": "Explain photosynthesis."})
print(result['result'])
print([doc.metadata for doc in result['source_documents']])
You can also iterate over the returned source documents to print each one's full content along with its metadata.
LangChain
result = qa({'query': "Who wrote Hamlet?"})
print(result['result'])
for doc in result['source_documents']:
    print(doc.page_content)
    print(doc.metadata)
Sample Program
This program loads a FAISS index with embeddings, creates a retrieval QA chain that returns source documents, asks a question, and prints the answer along with metadata and a snippet from each source document.
LangChain
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

# Load embeddings and FAISS index
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.load_local("faiss_index", embeddings)
retriever = vectorstore.as_retriever()  # convert the vector store into a retriever

# Create QA chain with source documents returned
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

# Ask a question
query = "What is the tallest mountain in the world?"
result = qa({'query': query})

# Print answer
print("Answer:", result['result'])

# Print source citations
print("Sources:")
for i, doc in enumerate(result['source_documents'], 1):
    print(f"Source {i} metadata:", doc.metadata)
    print(f"Source {i} content snippet:", doc.page_content[:100], "...")
Important Notes
Make sure your documents have metadata like source name or page number for clear citations.
Returning source documents may slow down response time slightly but improves trust.
You can customize how source info is shown by processing the metadata and content.
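The last note above can be sketched as a small helper that turns retrieved source documents into numbered, user-facing citation lines. The Document stand-in below mirrors LangChain's page_content/metadata shape, and the metadata keys "source" and "page" are assumptions about how your index was built:

```python
from dataclasses import dataclass, field

# Stand-in mirroring langchain.schema.Document's shape.
@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_citations(source_documents, snippet_len=80):
    """Turn retrieved documents into numbered citation lines."""
    lines = []
    for i, doc in enumerate(source_documents, 1):
        source = doc.metadata.get("source", "unknown source")
        page = doc.metadata.get("page")
        location = f"{source}, p. {page}" if page is not None else source
        snippet = doc.page_content[:snippet_len]
        lines.append(f'[{i}] {location}: "{snippet}..."')
    return "\n".join(lines)

docs = [
    Document("Mount Everest is the tallest mountain above sea level.",
             {"source": "atlas.pdf", "page": 42}),
]
print(format_citations(docs))
```

In a real chain you would call format_citations(result['source_documents']) after running the query, and adjust the fields and layout to match whatever metadata your documents carry.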
Summary
Source citation shows where AI answers come from, making them trustworthy.
Set return_source_documents=True in LangChain's RetrievalQA to get sources along with the answer.
Print metadata and content snippets to share clear citations with users.