LangChain framework · ~10 mins

Memory-augmented retrieval in LangChain - Step-by-Step Execution

Concept Flow - Memory-augmented retrieval
1. User Query Input
2. Check Memory for Context
3. Retrieve Relevant Memory Entries
4. Combine Query + Memory Context
5. Send to Retriever/LLM
6. Get Response
7. Update Memory with New Info
8. Output Answer
The system takes a user query, checks stored memory for relevant context, combines the two, sends the combined input to the retriever or language model, and then updates memory with the new information before outputting the answer.
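The flow above can be sketched in plain Python, with no LangChain required. All names here are illustrative, and `answer_fn` stands in for the retriever/LLM call:

```python
def memory_augmented_answer(query, memory, answer_fn):
    # Steps 1-3: check memory and retrieve relevant past entries
    # (in this toy sketch, every stored entry counts as relevant)
    relevant = list(memory)
    # Step 4: combine the new query with the retrieved memory context
    combined = "\n".join(relevant + [query]) if relevant else query
    # Steps 5-6: send the combined input to the retriever/LLM and get a response
    response = answer_fn(combined)
    # Step 7: update memory with the new Q&A pair for future turns
    memory.append(f"Q: {query}")
    memory.append(f"A: {response}")
    # Step 8: output the answer
    return response

memory = []
echo = lambda text: f"Answer based on: {text!r}"  # stand-in for a real LLM call
first = memory_augmented_answer("What is AI?", memory, echo)
second = memory_augmented_answer("Give an example.", memory, echo)
```

After the first turn, memory holds one Q&A pair; the second turn's combined input therefore includes the earlier question and answer, which is exactly what makes the follow-up answerable in context.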
Execution Sample
LangChain
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

# Assume llm and retriever are defined (e.g., llm = ChatOpenAI(), retriever = vectorstore.as_retriever())
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(llm, retriever, memory=memory)
response = chain.invoke({"question": "What is AI?"})["answer"]
This code creates a conversation memory buffer and a conversational retrieval chain that uses the buffer's chat history to answer the question 'What is AI?' and then stores the new Q&A pair back into memory.
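Under the hood, ConversationalRetrievalChain first condenses the chat history and the new question into a standalone question before retrieval. The idea can be illustrated in pure Python (this is a rough sketch, not LangChain's actual condense prompt):

```python
def condense_question(chat_history, question):
    """Fold prior (question, answer) turns into a standalone question (illustrative)."""
    if not chat_history:
        # First turn: no history to fold in, pass the question through unchanged
        return question
    history_text = " ".join(f"Q: {q} A: {a}" for q, a in chat_history)
    return f"Given the conversation ({history_text}), answer: {question}"

standalone = condense_question([], "What is AI?")
followup = condense_question(
    [("What is AI?", "AI is Artificial Intelligence...")],
    "Give an example.",
)
```

On the first turn the question passes through unchanged (matching the empty-memory rows of the execution table); on a follow-up turn, the prior Q&A is folded in so the retriever sees the full context.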
Execution Table
| Step | Action | Memory State Before | Memory Retrieved | Combined Query | Response Generated | Memory State After |
|---|---|---|---|---|---|---|
| 1 | User inputs query 'What is AI?' | Empty | None | What is AI? | N/A | Empty |
| 2 | Retrieve relevant memory entries | Empty | None | What is AI? | N/A | Empty |
| 3 | Send combined query to retriever/LLM | Empty | None | What is AI? | AI is Artificial Intelligence... | Empty |
| 4 | Update memory with new info | Empty | None | What is AI? | AI is Artificial Intelligence... | Memory updated with Q&A |
| 5 | Output answer | Memory updated with Q&A | N/A | What is AI? | AI is Artificial Intelligence... | Memory updated with Q&A |
💡 Process ends after outputting the answer and updating memory with the new conversation.
Variable Tracker
| Variable | Start | After Step 1 | After Step 4 | Final |
|---|---|---|---|---|
| memory | Empty | Empty | Contains Q: 'What is AI?' and A: 'AI is Artificial Intelligence...' | Contains Q: 'What is AI?' and A: 'AI is Artificial Intelligence...' |
| query | N/A | 'What is AI?' | 'What is AI?' | 'What is AI?' |
| response | N/A | N/A | 'AI is Artificial Intelligence...' | 'AI is Artificial Intelligence...' |
Key Moments - 3 Insights
Why does the system check memory before sending the query to the retriever?
Checking memory first adds context from past conversations, making the response more relevant. See rows 2 and 3 of the execution table, where memory retrieval happens before the response is generated.
What happens if memory is empty at the start?
If memory is empty, the system sends the query alone to the retriever or LLM, as shown in row 3 of the execution table, and then updates memory after getting the response (row 4).
How does memory get updated after the response?
After generating the response, the system stores the question-and-answer pair in memory for future context, as shown in row 4 of the execution table and the 'memory' row of the variable tracker.
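That update step can be demonstrated with a minimal buffer that mirrors what ConversationBufferMemory does. This is a toy stand-in, not the real LangChain class:

```python
class SimpleBufferMemory:
    """Toy stand-in for a conversation buffer memory (illustrative only)."""

    def __init__(self):
        self.turns = []  # ordered list of (question, answer) pairs

    def save_context(self, question, answer):
        # Step 4 of the execution table: store the new Q&A pair after the response
        self.turns.append((question, answer))

    def load_context(self):
        # Step 2 of the execution table: render past turns as retrievable context
        return "\n".join(f"Human: {q}\nAI: {a}" for q, a in self.turns)

buf = SimpleBufferMemory()
before = buf.load_context()  # empty string on the first turn
buf.save_context("What is AI?", "AI is Artificial Intelligence...")
after = buf.load_context()   # now contains the stored Q&A pair
```

The before/after contrast matches the variable tracker: memory is empty through step 3 and contains the Q&A pair only after step 4.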
Visual Quiz - 3 Questions
Test your understanding
Looking at the execution table, what is the memory state before the first retrieval?
A. Contains previous Q&A
B. Empty
C. Contains only the query
D. Contains the response
💡 Hint
Check row 2 of the execution table under 'Memory State Before'.
At which step does the system update the memory with new information?
A. Step 4
B. Step 3
C. Step 2
D. Step 5
💡 Hint
Look at row 4 of the execution table under 'Action' and 'Memory State After'.
If the memory already contained relevant context, how would the combined query change?
A. It would be only the new query
B. It would be empty
C. It would include the query plus memory context
D. It would be only the memory context
💡 Hint
Refer to the concept flow, where the query and memory context are combined before being sent to the retriever.
Concept Snapshot
Memory-augmented retrieval:
- Takes user query
- Retrieves relevant past memory
- Combines query + memory context
- Sends combined input to retriever or LLM
- Gets response
- Updates memory with new Q&A
- Outputs answer
This improves responses by using conversation history.
Full Transcript
Memory-augmented retrieval means the system remembers past conversations or information. When a user asks a question, the system first looks into its memory to find related context. It then combines this context with the new question and sends it to a retriever or language model to get a better answer. After getting the answer, it updates the memory with the new question and answer pair. This way, the system learns and improves over time by using past information to help answer new questions.