LangChain framework · ~30 mins

Source citation in RAG responses in LangChain - Mini Project: Build & Apply

📖 Scenario: You are building a simple Retrieval-Augmented Generation (RAG) system using LangChain. This system answers questions by retrieving relevant documents and then generating answers that include citations to the sources.
🎯 Goal: Create a LangChain script that retrieves documents for a query and generates an answer that includes source citations from the retrieved documents.
📋 What You'll Learn
Create a list called documents with three strings representing document texts.
Create a variable called query with the exact string 'What is LangChain?'.
Use LangChain's FAISS vector store to index the documents.
Create a retriever from the FAISS index with k=2 to get top 2 documents.
Use LangChain's RetrievalQA chain with a dummy LLM that returns a fixed answer including citations.
Generate an answer for the query that includes citations from the retrieved documents.
💡 Why This Matters
🌍 Real World
RAG systems are used in chatbots, virtual assistants, and search engines to provide accurate answers with references to trusted sources.
💼 Career
Understanding how to build RAG pipelines with source citation is valuable for AI developers, data scientists, and software engineers working on intelligent applications.
1
DATA SETUP: Create documents and query
Create a list called documents with these exact strings: 'LangChain is a framework for building applications with LLMs.', 'It helps connect LLMs with external data sources.', and 'Source citation is important in RAG systems.'. Also create a variable called query with the exact string 'What is LangChain?'.
Need a hint?

Use a Python list for documents and assign the exact strings. Then assign the exact string to query.
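This first step is plain Python, so it can be written out exactly as specified:

```python
# Step 1: the exact documents list and query the exercise asks for.
documents = [
    'LangChain is a framework for building applications with LLMs.',
    'It helps connect LLMs with external data sources.',
    'Source citation is important in RAG systems.',
]
query = 'What is LangChain?'
```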

2
CONFIGURATION: Create FAISS index and retriever
Import FAISS and OpenAIEmbeddings from LangChain. Create an embeddings variable using OpenAIEmbeddings(). Then create a FAISS index called index by calling FAISS.from_texts(documents, embeddings). Finally, create a retriever called retriever from index.as_retriever() with search_kwargs={'k': 2}.
Need a hint?

Import the classes, create embeddings, then create the FAISS index and retriever with k=2.
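The real step uses FAISS with OpenAI embeddings, which needs the `langchain` package and an OpenAI API key (and the import paths vary by LangChain version). The sketch below shows those calls in comments, then approximates what `retriever.get_relevant_documents(query)` returns with a simple word-overlap ranking — a hypothetical offline stand-in, not FAISS similarity search:

```python
# Real version (requires langchain and OPENAI_API_KEY; import paths vary by version):
#   from langchain.embeddings import OpenAIEmbeddings
#   from langchain.vectorstores import FAISS
#   embeddings = OpenAIEmbeddings()
#   index = FAISS.from_texts(documents, embeddings)
#   retriever = index.as_retriever(search_kwargs={'k': 2})

# Offline stand-in: rank documents by word overlap with the query, keep the top k.
def retrieve(documents, query, k=2):
    q_words = {w.strip('.,?!').lower() for w in query.split()}
    def score(doc):
        d_words = {w.strip('.,?!').lower() for w in doc.split()}
        return len(q_words & d_words)
    return sorted(documents, key=score, reverse=True)[:k]

documents = [
    'LangChain is a framework for building applications with LLMs.',
    'It helps connect LLMs with external data sources.',
    'Source citation is important in RAG systems.',
]
top_docs = retrieve(documents, 'What is LangChain?')
```

With this query, the first document scores highest (it shares "LangChain" and "is"), so it comes back first in the top-2 results.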

3
CORE LOGIC: Create RetrievalQA chain with dummy LLM
Import RetrievalQA from LangChain. Create a dummy LLM class called DummyLLM with a __call__ method that takes prompt and returns the string 'LangChain is a framework for LLM apps. [Sources: Doc1, Doc2]'. Then create an llm instance of DummyLLM. Finally, create a qa variable by calling RetrievalQA.from_chain_type(llm=llm, retriever=retriever).
Need a hint?

Create a dummy LLM class that returns a fixed answer with citations. Then create the RetrievalQA chain using this LLM and the retriever.
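A sketch of the dummy LLM. One caveat: in recent LangChain versions a custom LLM is expected to subclass the base `LLM` class and implement `_call`; a bare callable like this only works with versions/chains that accept plain callables, so treat it as an exercise-specific stand-in:

```python
# A stand-in LLM that ignores its prompt and returns a canned, cited answer.
class DummyLLM:
    def __call__(self, prompt):
        return 'LangChain is a framework for LLM apps. [Sources: Doc1, Doc2]'

llm = DummyLLM()

# Real chain construction (requires langchain, and the retriever from step 2):
#   from langchain.chains import RetrievalQA
#   qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
```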

4
COMPLETION: Generate answer with citations
Call qa.run(query) and assign the result to a variable called answer. This will generate the answer including source citations.
Need a hint?

Use qa.run(query) to get the answer with citations and assign it to answer.
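With a real chain this final step is just `answer = qa.run(query)` (newer LangChain versions use `qa.invoke({'query': query})` instead). As a self-contained offline sketch, the code below wires the pieces together the way RetrievalQA's "stuff" chain does — retrieve, pack the documents into a prompt, call the LLM. The `retrieve` helper and `run_qa` function are hypothetical stand-ins, not LangChain APIs:

```python
documents = [
    'LangChain is a framework for building applications with LLMs.',
    'It helps connect LLMs with external data sources.',
    'Source citation is important in RAG systems.',
]
query = 'What is LangChain?'

def retrieve(docs, q, k=2):
    # Toy top-k retrieval by word overlap (stand-in for the FAISS retriever).
    q_words = {w.strip('.,?!').lower() for w in q.split()}
    return sorted(
        docs,
        key=lambda d: len(q_words & {w.strip('.,?!').lower() for w in d.split()}),
        reverse=True,
    )[:k]

class DummyLLM:
    # Ignores the prompt and returns a fixed answer with citations.
    def __call__(self, prompt):
        return 'LangChain is a framework for LLM apps. [Sources: Doc1, Doc2]'

def run_qa(docs, q, llm):
    # Mimic RetrievalQA: retrieve, "stuff" the context into a prompt, call the LLM.
    context = '\n'.join(retrieve(docs, q))
    prompt = f'Answer using the context:\n{context}\n\nQuestion: {q}'
    return llm(prompt)

answer = run_qa(documents, query, DummyLLM())
```

Because the dummy LLM is fixed, `answer` always ends with the citation marker `[Sources: Doc1, Doc2]`; with a real LLM, the prompt would need to instruct the model to cite the retrieved sources.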