
Similarity search vs MMR retrieval in LangChain - Visual Side-by-Side Comparison

Concept Flow - Similarity search vs MMR retrieval
Input Query -> Compute Embedding -> Find Top K Similar
  Similarity search: -> Return Top K Results
  MMR retrieval: -> Balance Relevance & Diversity -> Return Diverse Results
The flow starts with an input query. Similarity search finds the top similar items directly. MMR retrieval first finds candidates then selects results balancing relevance and diversity.
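The similarity-search branch of the flow can be sketched in plain Python. This is only an illustration under stated assumptions: the 2-D vectors, the document ids, and the helper names `cosine` and `top_k_similar` are hypothetical and are not part of LangChain, and cosine similarity is assumed as the similarity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_similar(query_vec, doc_vecs, k):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
docs = {
    "A": [1.0, 0.0],
    "B": [0.9, 0.1],
    "C": [0.8, 0.3],
    "D": [0.0, 1.0],
}
query_vec = [1.0, 0.0]

top = top_k_similar(query_vec, docs, k=3)
print(top)  # ['A', 'B', 'C']
```

Nothing in this ranking looks at how similar the returned documents are to each other, which is the gap MMR is meant to fill.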
Execution Sample
LangChain
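# Assumes `vectorstore` is an existing LangChain vector store (for example FAISS or Chroma)
query = "What is AI?"
results_sim = vectorstore.similarity_search(query, k=3)
results_mmr = vectorstore.max_marginal_relevance_search(query, k=3, fetch_k=10, lambda_mult=0.5)
print(results_sim)
print(results_mmr)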
This code runs similarity search and MMR retrieval on the same query, showing different result sets.
Execution Table
| Step | Method | Action | Intermediate Result | Output |
|---|---|---|---|---|
| 1 | Both | Receive query 'What is AI?' | Query embedding computed | None |
| 2 | Similarity Search | Find top 3 most similar documents by embedding similarity | Top 3 docs by similarity scores | Docs A, B, C |
| 3 | MMR Retrieval | Find candidate documents (e.g., top 10 by similarity) | Candidate set of 10 docs | Docs A-J |
| 4 | MMR Retrieval | Iteratively select docs balancing relevance and diversity using lambda=0.5 | Selected docs after each iteration | Docs A, D, F |
| 5 | Similarity Search | Return final results | Similarity search returns top 3 similar | Docs A, B, C |
| 6 | MMR Retrieval | Return final results | MMR returns diverse top 3 balancing similarity and novelty | Docs A, D, F |
| 7 | Both | End | No more steps | Process complete |
💡 Both methods finish after selecting k=3 documents. Similarity search returns only the closest matches, while MMR gives up some closeness in exchange for diversity.
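Steps 3 and 4 of the table can be sketched as a greedy loop. This is a minimal illustration, not LangChain's implementation: the names `mmr_select`, `fetch_k`, `lam`, and `unit`, and the toy documents placed at fixed angles around the query, are all assumptions made for the example. The toy data does not reproduce the Docs A, D, F outcome in the table.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mmr_select(query_vec, doc_vecs, k, fetch_k, lam):
    """Greedy MMR: pick k ids from the fetch_k most query-similar candidates."""
    relevance = {d: cosine(query_vec, v) for d, v in doc_vecs.items()}
    # Step 3: the candidate set is the top fetch_k documents by similarity.
    candidates = sorted(relevance, key=relevance.get, reverse=True)[:fetch_k]
    selected = []
    # Step 4: repeatedly add the candidate with the best relevance/novelty trade-off.
    while candidates and len(selected) < k:
        def score(d):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(doc_vecs[d], doc_vecs[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[d] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

def unit(deg):
    """Hypothetical helper: a 2-D unit vector at the given angle in degrees."""
    r = math.radians(deg)
    return [math.cos(r), math.sin(r)]

# A, B, and C are near-duplicates; D and E point in other directions.
docs = {"A": unit(10), "B": unit(12), "C": unit(14), "D": unit(-30), "E": unit(60)}
query_vec = unit(0)

selected = mmr_select(query_vec, docs, k=3, fetch_k=5, lam=0.5)
print(selected)  # ['A', 'D', 'B']
```

With this toy data, plain similarity ranking would return A, B, C. MMR keeps A, then prefers the less redundant D over the near-duplicate B, and only afterwards picks B.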
Variable Tracker
| Variable | Start | After Step 2 | After Step 4 | Final |
|---|---|---|---|---|
| query | "What is AI?" | "What is AI?" | "What is AI?" | "What is AI?" |
| similarity_search_results | [] | [Doc A, Doc B, Doc C] | [Doc A, Doc B, Doc C] | [Doc A, Doc B, Doc C] |
| mmr_candidates | [] | N/A | [Doc A-J] | [Doc A-J] |
| mmr_selected | [] | N/A | [Doc A, Doc D, Doc F] | [Doc A, Doc D, Doc F] |
Key Moments - 3 Insights
Why does MMR retrieval select different documents than similarity search?
MMR retrieval balances relevance and diversity: it iteratively selects documents that are similar to the query but different from those already selected, as shown in steps 3 and 4 of the Execution Table.
Does similarity search consider diversity in results?
No, similarity search only ranks documents by similarity score and returns the top k, so results may be very similar to each other, as seen in step 2.
What role does the lambda parameter play in MMR retrieval?
Lambda controls the trade-off between relevance and diversity during selection in MMR retrieval, affecting which documents are chosen in step 4.
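The trade-off can be written as a selection rule. The notation below is introduced here for illustration and does not appear in the page above: q is the query, C the candidate set, S the documents selected so far, and sim an assumed similarity function such as cosine similarity. This is the commonly cited form of the MMR criterion.

```latex
d^{*} = \arg\max_{d \in C \setminus S}
  \Big[ \lambda \,\mathrm{sim}(d, q)
        \;-\; (1 - \lambda) \max_{s \in S} \mathrm{sim}(d, s) \Big]
```

With λ = 1 the second term vanishes and the rule reduces to ranking by similarity to the query. As λ decreases, similarity to already selected documents is penalized more heavily.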
Visual Quiz - 3 Questions
Test your understanding
Look at step 2 of the Execution Table. Which documents does similarity search return?
A. Docs A, D, F
B. Docs A, B, C
C. Docs A-J
D. Docs D, E, F
💡 Hint
Check the 'Output' column for step 2 in the Execution Table.
At which step does MMR retrieval balance relevance and diversity?
A. Step 4
B. Step 3
C. Step 2
D. Step 5
💡 Hint
Look at the 'Action' column for MMR retrieval in the Execution Table.
If lambda in MMR retrieval is set to 1, what would happen to the results?
A. No documents are selected
B. Results become more diverse
C. Results become identical to similarity search
D. Results become random
💡 Hint
Lambda=1 means full weight on relevance; see the Key Moments entry on lambda's role.
Concept Snapshot
Similarity Search:
- Finds top k documents by embedding similarity
- Results may be similar to each other

MMR Retrieval:
- Finds candidates then selects k balancing relevance and diversity
- Uses lambda to control trade-off

Use similarity search for pure relevance
Use MMR to get diverse relevant results
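The two recommendations above can be checked against each other with a small sweep over lambda. This is a hedged sketch: `mmr`, `unit`, `lam`, and the toy documents are hypothetical names and data chosen for illustration, and every document is treated as a candidate for simplicity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mmr(query_vec, doc_vecs, k, lam):
    """Greedy MMR over all documents; lam=1.0 puts full weight on relevance."""
    remaining = list(doc_vecs)
    selected = []
    while remaining and len(selected) < k:
        def score(d):
            rel = cosine(query_vec, doc_vecs[d])
            red = max((cosine(doc_vecs[d], doc_vecs[s]) for s in selected), default=0.0)
            return lam * rel - (1 - lam) * red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

def unit(deg):
    """Hypothetical helper: a 2-D unit vector at the given angle in degrees."""
    r = math.radians(deg)
    return [math.cos(r), math.sin(r)]

docs = {"A": unit(10), "B": unit(12), "C": unit(14), "D": unit(-30), "E": unit(60)}
query_vec = unit(0)

# Baseline: plain similarity ranking.
by_similarity = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)[:3]

pure_relevance = mmr(query_vec, docs, k=3, lam=1.0)
balanced = mmr(query_vec, docs, k=3, lam=0.5)
print(by_similarity, pure_relevance, balanced)
```

On this toy data, lambda=1.0 returns the same documents as the similarity ranking, while lambda=0.5 swaps one near-duplicate for a more distinct document.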
Full Transcript
This visual execution compares similarity search and MMR retrieval in LangChain. Both start by embedding the input query. Similarity search directly finds the top k most similar documents and returns them. MMR retrieval first finds a larger candidate set, then iteratively selects documents, balancing relevance to the query against diversity from already selected documents; a lambda parameter controls that balance. The execution table shows step-by-step actions and outputs for both methods. Variable tracking shows how results evolve. Key moments clarify why MMR returns different documents and what role lambda plays. The quiz tests understanding of these steps and concepts. The snapshot summarizes when to use each method.