Similarity search vs MMR retrieval in LangChain - Performance Comparison

Performance: Similarity search vs MMR retrieval
MEDIUM IMPACT
The choice of retrieval method affects how quickly relevant documents are retrieved and shown to users, which in turn affects interaction responsiveness and perceived speed.
Retrieving relevant documents for a user query
LangChain
results = vectorstore.max_marginal_relevance_search(query, k=10, fetch_k=20)
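MMR (maximal marginal relevance) retrieval first fetches fetch_k candidates by similarity (20 here), then re-ranks them and returns the k results (10 here) that balance relevance to the query against diversity among the results. This reduces redundancy in what the user sees.
📈 Performance Gain: slightly higher CPU cost, but users reach useful results sooner because fewer of them are near-duplicates.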
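To make the re-ranking step concrete, here is a minimal pure-Python sketch of the greedy MMR selection rule. It is not LangChain's internal code: `cosine`, `mmr_select`, and the toy 2-D vectors are illustrative names and data, and `lambda_mult` is assumed to weight relevance (1.0) against diversity (0.0).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, candidate_vecs, k, lambda_mult=0.5):
    """Greedy MMR: return indices of up to k candidates, in selection order."""
    relevance = [cosine(query_vec, vec) for vec in candidate_vecs]
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        best, best_score = None, None
        for i in remaining:
            # Redundancy = highest similarity to anything already selected.
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
            if best_score is None or score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed for illustration): the first two candidates are near-duplicates.
query = [1.0, 0.3]
candidates = [[1.0, 0.2], [1.0, 0.25], [0.5, 1.0]]
picked = mmr_select(query, candidates, k=2)
print(picked)
```

With these toy vectors, the second pick is the dissimilar third candidate rather than the near-duplicate of the first pick. Setting `lambda_mult=1.0` removes the diversity penalty, so the selection falls back to plain relevance order.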
Retrieving relevant documents for a user query
LangChain
results = vectorstore.similarity_search(query, k=10)
Simple similarity search returns the k vectors closest to the query, but several of them may be near-duplicates of one another, which reduces result diversity.
📉 Performance Cost: retrieval is fast with minimal computation, but users may spend more time scanning redundant results.
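The redundancy problem is easy to reproduce with a brute-force top-k ranking. The sketch below is an assumption-laden illustration, not how a real vector store works internally (production stores usually rely on an index rather than scoring every vector); `top_k_by_similarity` and the toy vectors are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_similarity(query_vec, candidate_vecs, k):
    """Return indices of the k candidates most similar to the query."""
    scored = ((cosine(query_vec, vec), idx) for idx, vec in enumerate(candidate_vecs))
    return [idx for _, idx in heapq.nlargest(k, scored)]


# Toy data (assumed for illustration): the first two candidates are near-duplicates.
query = [1.0, 0.3]
candidates = [[1.0, 0.2], [1.0, 0.25], [0.5, 1.0]]
top_two = top_k_by_similarity(query, candidates, k=2)
print(top_two)
```

Both returned indices point at the near-duplicate pair, so the user sees essentially the same document twice while the distinct third candidate is left out.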
Performance Comparison
| Pattern | Computation Cost | Response Time | Result Diversity | Verdict |
|---|---|---|---|---|
| Similarity Search | Low (simple top-k vector similarity) | Faster (direct retrieval) | Low (may have redundant results) | [OK] |
| MMR Retrieval | Medium (similarity + re-ranking) | Slightly slower (extra CPU work) | High (diverse, less redundancy) | [OK] Good |
Rendering Pipeline
The query triggers vector similarity calculations; similarity search directly returns top matches, while MMR retrieval adds a re-ranking step to optimize diversity before results render.
JavaScript Execution
Network Request
Rendering
⚠️ Bottleneck: JavaScript Execution, due to the extra re-ranking computations in MMR
Core Web Vital Affected
INP (Interaction to Next Paint)
Optimization Tips
1. Similarity search is faster but may return redundant results.
2. MMR retrieval improves result diversity at a small CPU cost.
3. Keep fetch_k small, since re-ranking cost grows with the number of candidates, and cache embeddings to reduce MMR overhead.
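The third tip can be sketched with the standard library alone. Everything here is hypothetical: `slow_embed` stands in for a real embedding call, `cached_embed` wraps it with an in-memory cache, and the `max_fetch_k=50` ceiling in `clamp_fetch_k` is an arbitrary value to tune for your own data, not a recommended setting.

```python
from functools import lru_cache
from typing import Tuple


def slow_embed(text: str) -> Tuple[float, ...]:
    """Hypothetical stand-in for a real embedding call (model inference or API request)."""
    # Toy deterministic features, NOT a real embedding model.
    return (float(len(text)), float(text.count(" ")))


@lru_cache(maxsize=1024)
def cached_embed(text: str) -> Tuple[float, ...]:
    """Embed a text once; repeated identical queries are served from memory."""
    return slow_embed(text)


def clamp_fetch_k(k: int, fetch_k: int, max_fetch_k: int = 50) -> int:
    """Keep the candidate pool at least k, but no larger than a chosen ceiling."""
    return max(k, min(fetch_k, max_fetch_k))
```

A repeated query such as `cached_embed("what is mmr")` only pays the embedding cost the first time, and `clamp_fetch_k(10, 200)` caps an oversized candidate pool before it reaches the re-ranking step.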
Performance Quiz - 3 Questions
Test your performance knowledge
Which retrieval method typically uses more CPU time during query processing?
A. Simple similarity search
B. MMR retrieval with re-ranking
C. Both use the same CPU time
D. Neither uses CPU time
DevTools: Performance
How to check: Record a performance profile while triggering a query; compare CPU time spent in JavaScript execution for similarity search vs MMR retrieval.
What to look for: Look for longer scripting time and call stacks related to re-ranking logic in MMR retrieval, indicating extra computation.
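If retrieval runs in a Python process rather than in the browser, a browser profile will only show it as time spent waiting on the request. In that case a rough wall-clock comparison can be made with a small timing helper like the one below. This is a sketch: `median_seconds` is a hypothetical helper, and the commented usage assumes `vectorstore` and `query` already exist in your application.

```python
import statistics
import time


def median_seconds(fn, *args, repeats=5, **kwargs):
    """Call fn(*args, **kwargs) `repeats` times and return the median wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Usage, assuming `vectorstore` and `query` are defined elsewhere:
# sim_time = median_seconds(vectorstore.similarity_search, query, k=10)
# mmr_time = median_seconds(
#     vectorstore.max_marginal_relevance_search, query, k=10, fetch_k=20
# )
```

The median is used instead of the mean so that a single slow run, such as a cold cache on the first call, does not dominate the comparison.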