Similarity search finds the items closest to your query. MMR (maximal marginal relevance) retrieval balances closeness and variety so the results do not repeat each other.
Similarity search vs MMR retrieval in LangChain
from langchain.vectorstores import FAISS

# Similarity search
results = vectorstore.similarity_search(query, k=5)

# MMR retrieval
results = vectorstore.max_marginal_relevance_search(query, k=5, fetch_k=10)
similarity_search returns the top k closest matches to the query.
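max_marginal_relevance_search first fetches the fetch_k most similar candidates, then picks k of them that balance similarity to the query against diversity among the results already chosen. fetch_k should therefore be at least as large as k.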
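The difference between the two methods can be sketched in plain Python. This is a minimal illustration of the general MMR idea on made-up 2-D vectors, not LangChain's internal code: the helper names `cosine`, `top_k`, and `mmr` and the toy vectors are assumptions for this example, and `lambda_mult=0.5` weights relevance and diversity equally.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query, vectors, k):
    """Indices of the k vectors most similar to the query, best first."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]

def mmr(query, vectors, k, fetch_k, lambda_mult=0.5):
    """Pick k of the fetch_k most similar vectors, penalising redundancy."""
    candidates = top_k(query, vectors, fetch_k)
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, vectors[i])
            # Highest similarity to anything already selected (0 if nothing yet).
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical embeddings: indices 0 and 1 are near-duplicates of each other.
vectors = [
    [1.0, 0.0],
    [0.99, 0.05],
    [0.7, 0.7],
    [0.0, 1.0],
]
query = [1.0, 0.1]

similarity_top2 = top_k(query, vectors, k=2)
mmr_top2 = mmr(query, vectors, k=2, fetch_k=3)

print("similarity:", similarity_top2)  # both near-duplicates
print("mmr:       ", mmr_top2)         # one near-duplicate plus a different item
```

In this toy data, plain top-k returns both near-duplicates, while the MMR sketch keeps the best match and then prefers the candidate that differs most from it. Because fetch_k is 3, the least relevant vector never enters the candidate pool.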
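# Top 3 closest matches
results = vectorstore.similarity_search('What is AI?', k=3)
# 3 diverse results chosen from the 6 closest candidates
results = vectorstore.max_marginal_relevance_search('What is AI?', k=3, fetch_k=6)
# A query with no good match still returns the 3 nearest documents
results = vectorstore.similarity_search('Nonexistent topic', k=3)
# Same for MMR: 3 results chosen from the 5 nearest candidates
results = vectorstore.max_marginal_relevance_search('Nonexistent topic', k=3, fetch_k=5)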
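The 'Nonexistent topic' calls are worth a closer look: nearest-neighbour search ranks whatever is stored, so it returns k results even when none of them is a good match. The sketch below illustrates this with made-up vectors. The helper `top_k_with_scores` and the `MIN_SCORE` cutoff are assumptions for this example, not part of LangChain. LangChain vector stores such as FAISS also offer `similarity_search_with_score` if you want to inspect scores yourself.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k_with_scores(query, vectors, k):
    """Return (index, score) pairs for the k most similar vectors."""
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings: every stored vector points roughly along the first axis.
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.7, 0.3, 0.0],
]
# A query that points in a very different direction from all stored vectors.
unrelated_query = [0.0, 0.1, 1.0]

hits = top_k_with_scores(unrelated_query, vectors, k=3)
print(len(hits))  # k results come back even though nothing matches well

# Illustrative cutoff: keep only hits whose similarity clears a threshold.
MIN_SCORE = 0.5
relevant = [index for index, score in hits if score >= MIN_SCORE]
print(relevant)
```

Filtering by score is one way to avoid passing irrelevant documents downstream. The right threshold depends on the embedding model and the distance metric, so treat the value here as a placeholder.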
This program creates a small vector store from example sentences about AI. It runs both similarity search and MMR retrieval for the query 'Tell me about AI'. It prints the top 3 results from each method so you can see the difference.
# Requires the faiss-cpu package and the OPENAI_API_KEY environment variable.
# In newer LangChain releases these imports live in
# langchain_community.vectorstores and langchain_openai.
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

# Sample documents
documents = [
    'AI is the simulation of human intelligence.',
    'Machine learning is a subset of AI.',
    'Deep learning uses neural networks.',
    'AI can be used in healthcare.',
    'Neural networks mimic the brain.'
]

# Create embeddings
embeddings = OpenAIEmbeddings()

# Build vector store
vectorstore = FAISS.from_texts(documents, embeddings)

query = 'Tell me about AI'

# Similarity search
similar_results = vectorstore.similarity_search(query, k=3)

# MMR retrieval
mmr_results = vectorstore.max_marginal_relevance_search(query, k=3, fetch_k=5)

print('Similarity Search Results:')
for doc in similar_results:
    print('-', doc.page_content)

print('\nMMR Retrieval Results:')
for doc in mmr_results:
    print('-', doc.page_content)
Similarity search is fast and finds the closest matches but can return very similar or repeated results.
MMR retrieval adds diversity by balancing similarity and novelty, which is useful when you want varied answers.
MMR is slightly slower because it fetches fetch_k candidates (more than the k it returns) and then compares them with each other to pick a diverse subset.
Similarity search finds the closest matches to your query.
MMR retrieval balances closeness and diversity to avoid repeated or very similar results.
Use similarity search for quick, relevant results; use MMR when you want variety in answers.