Buffer vs Summary vs Window Memory in LangChain: Key Differences and Usage
BufferMemory stores all past interactions as-is, SummaryMemory compresses past interactions into a concise summary, and WindowMemory keeps only the most recent interactions within a fixed window size. Each type manages conversation history differently to balance detail and memory size.
Quick Comparison
Here is a quick table comparing the key features of Buffer, Summary, and Window memory types in LangChain.
| Feature | BufferMemory | SummaryMemory | WindowMemory |
|---|---|---|---|
| Storage Method | Stores all past messages verbatim | Stores a summarized version of past messages | Stores only recent messages within a fixed limit |
| Memory Size | Grows indefinitely | Stays compact via summarization | Fixed size based on window length |
| Context Detail | Full detail of conversation | Condensed context, less detail | Recent context only, detailed |
| Use Case | When full history is needed | When memory size must be small | When recent context is most relevant |
| Performance Impact | Can slow down with large history | Efficient for long chats | Efficient but limited context |
| Example Scenario | Chatbots needing full recall | Long conversations with summary | Short-term memory focus |
Key Differences
BufferMemory keeps every message in the conversation exactly as it happened. This means it can grow very large over time, which might slow down processing but ensures no detail is lost.
SummaryMemory uses a language model to create a short summary of past messages. This keeps memory size small and efficient but sacrifices some detail for brevity. It is great for long conversations where only the gist is needed.
WindowMemory stores only the last few messages up to a set limit. It keeps recent context detailed but forgets older parts of the conversation. This is useful when only the latest interactions matter, like quick back-and-forth chats.
Code Comparison
Here is how to use buffer memory in LangChain to store the full conversation history. The class is named `ConversationBufferMemory`.

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({'input': 'Hello'}, {'output': 'Hi there!'})
memory.save_context({'input': 'How are you?'}, {'output': 'I am fine, thanks!'})
# Returns every message verbatim under the 'history' key
print(memory.load_memory_variables({}))
```
SummaryMemory Equivalent
Here is how to use `ConversationSummaryMemory` in LangChain to keep a summarized conversation history. It requires an LLM to generate the summaries, so an OpenAI API key must be configured.

```python
from langchain.memory import ConversationSummaryMemory
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
memory = ConversationSummaryMemory(llm=llm)
memory.save_context({'input': 'Hello'}, {'output': 'Hi there!'})
memory.save_context({'input': 'How are you?'}, {'output': 'I am fine, thanks!'})
# Returns a running summary of the conversation instead of the raw messages
print(memory.load_memory_variables({}))
```
When to Use Which
Choose BufferMemory when you need to keep every detail of the conversation for accurate context or debugging.
Choose SummaryMemory when you want to handle long conversations efficiently by summarizing past interactions to save memory and speed.
Choose WindowMemory when your application only needs recent context, such as quick chats or commands, and older history can be discarded.