
Hybrid search (keyword + semantic) in LangChain - Step-by-Step Execution


Concept Flow - Hybrid search (keyword + semantic)

User Query Input

↓

Keyword Search

↓

Semantic Search

↓

Combine Results

↓

Rank & Return Final Results

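The same user query is run through both keyword search and semantic search. The two searches are independent of each other, so they can run in either order; their result sets are then combined and ranked before being shown to the user.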

Execution Sample


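# Import paths for recent LangChain releases (older ones used langchain.vectorstores, etc.)
from langchain_community.vectorstores import FAISS
from langchain_community.retrievers import BM25Retriever
from langchain_openai import OpenAIEmbeddings

# Assume bm25 (BM25Retriever) and vectorstore (FAISS) are already initialized
query = "climate change impact"

# Keyword search: BM25Retriever returns 4 documents by default, so limit it to 3
bm25.k = 3
results_keyword = bm25.invoke(query)

# Semantic search: top 3 documents by embedding similarity
results_semantic = vectorstore.similarity_search(query, k=3)

This code runs a keyword search and a semantic search on the same query and collects the top 3 results from each.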
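The sample stops before the two result lists are combined. A minimal sketch of that merge, assuming two documents count as the same when their text is identical: the `Doc` dataclass is a hypothetical stand-in for LangChain's `Document` (which also exposes `page_content`), and `merge_results` is an illustrative helper, not a LangChain function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a LangChain Document (only page_content)."""
    page_content: str


def merge_results(results_keyword, results_semantic):
    """Concatenate both lists, keeping the first occurrence of each document."""
    seen = set()
    combined = []
    for doc in list(results_keyword) + list(results_semantic):
        if doc.page_content not in seen:
            seen.add(doc.page_content)
            combined.append(doc)
    return combined


results_keyword = [Doc("Doc1"), Doc("Doc2"), Doc("Doc3")]
results_semantic = [Doc("Doc2"), Doc("Doc4"), Doc("Doc5")]  # Doc2 is in both lists

combined_results = merge_results(results_keyword, results_semantic)
print([d.page_content for d in combined_results])
# prints ['Doc1', 'Doc2', 'Doc3', 'Doc4', 'Doc5']
```

Merging this way only removes duplicates and keeps keyword hits ahead of semantic hits; it does not yet rank the combined list by relevance.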

Execution Table

Step	Action	Input	Output	Notes
1	Receive user query	"climate change impact"	Query stored	Start with raw user input
2	Run keyword search	"climate change impact"	Top 3 docs matching keywords	Find docs with exact words
3	Run semantic search	"climate change impact"	Top 3 docs by meaning	Find docs with similar meaning
4	Combine results	Keyword + Semantic results	6 docs combined	Merge both result sets, dropping duplicates (none in this example)
5	Rank combined results	6 docs	Final ranked list	Sort by a relevance score that accounts for both searches
6	Return results	Final ranked list	Displayed to user	Show best matches
7	Exit	N/A	Search complete	Process ends

💡 All steps complete, final ranked results returned to user
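The table does not say how step 5 turns two searches into one relevance order, and BM25 scores and vector similarities are not directly comparable. One common option is Reciprocal Rank Fusion (RRF), which uses only each document's rank in each list; LangChain's `EnsembleRetriever` uses a weighted form of it. The sketch below illustrates the idea and is not necessarily the ranking this walkthrough assumes: the function name `rrf_rank`, the constant `k=60`, and the document IDs are all assumptions.

```python
from collections import defaultdict


def rrf_rank(result_lists, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per document (rank starts at 1),
    so a document found by both searches accumulates two contributions.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


results_keyword = ["Doc1", "Doc2", "Doc3"]
results_semantic = ["Doc2", "Doc4", "Doc1"]

final_results = rrf_rank([results_keyword, results_semantic])
print(final_results)
# prints ['Doc2', 'Doc1', 'Doc4', 'Doc3']
```

Doc2 and Doc1 appear in both lists, so they rise above documents that only one search found. Doc2 edges out Doc1 because its two ranks (2nd and 1st) beat Doc1's (1st and 3rd).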

Variable Tracker

Variable	Start	After Step 2	After Step 3	After Step 4	After Step 5	Final
query	"climate change impact"	"climate change impact"	"climate change impact"	"climate change impact"	"climate change impact"	"climate change impact"
results_keyword	None	[Doc1, Doc2, Doc3]	[Doc1, Doc2, Doc3]	[Doc1, Doc2, Doc3]	[Doc1, Doc2, Doc3]	[Doc1, Doc2, Doc3]
results_semantic	None	None	[Doc4, Doc5, Doc6]	[Doc4, Doc5, Doc6]	[Doc4, Doc5, Doc6]	[Doc4, Doc5, Doc6]
combined_results	None	None	None	[Doc1, Doc2, Doc3, Doc4, Doc5, Doc6]	[Doc1, Doc2, Doc3, Doc4, Doc5, Doc6]	[Doc1, Doc2, Doc3, Doc4, Doc5, Doc6]
final_results	None	None	None	None	[Doc2, Doc4, Doc1, Doc5, Doc3, Doc6]	[Doc2, Doc4, Doc1, Doc5, Doc3, Doc6]

Key Moments - 3 Insights

Why do we run both keyword and semantic searches instead of just one?

How are the combined results ranked before returning?

What happens if the same document appears in both keyword and semantic results?
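One possible answer to the last two questions, sketched under assumptions: if each search returns a score per document, the scores can be rescaled to a common range and combined with a weighted sum, so a document returned by both searches receives both contributions. The helper names `min_max` and `fuse_scores`, the `weight` parameter, and every score below are made up for illustration; none of this is a LangChain API.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse_scores(keyword_scores, semantic_scores, weight=0.5):
    """Weighted sum of normalized scores; weight is the keyword share."""
    kw = min_max(keyword_scores)
    sem = min_max(semantic_scores)
    fused = {}
    for doc_id in set(kw) | set(sem):
        # A document missing from one search contributes 0 for that search
        fused[doc_id] = weight * kw.get(doc_id, 0.0) + (1 - weight) * sem.get(doc_id, 0.0)
    # Highest fused score first; ties are broken by document ID
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores: BM25 scores are unbounded, cosine similarities fall in [-1, 1]
keyword_scores = {"Doc1": 12.0, "Doc2": 9.0, "Doc3": 3.0}
semantic_scores = {"Doc2": 0.90, "Doc4": 0.80, "Doc5": 0.50}

ranked = fuse_scores(keyword_scores, semantic_scores)
print([doc_id for doc_id, _ in ranked])
# prints ['Doc2', 'Doc1', 'Doc4', 'Doc3', 'Doc5']
```

Doc2 ranks first because it is the only document with a contribution from both searches. A known drawback of min-max scaling is that the lowest-scoring document in each list is pushed to 0, which is one reason rank-based fusion is often preferred.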

Visual Quiz - 3 Questions

Test your understanding

Look at the Execution Table. What is the output after Step 3?

A. Top 3 docs matching keywords

B. Top 3 docs by meaning

C. Final ranked list

D. Combined 6 docs

Concept Snapshot

Hybrid search combines keyword and semantic search.
Keyword search finds exact word matches.
Semantic search finds meaning-based matches.
Results are merged, deduplicated, and ranked by relevance.
Covering both exact matches and related meanings finds relevant documents that either search alone could miss.

Full Transcript

Hybrid search in LangChain means running both a keyword search and a semantic search on the same user query. First, the query is taken as input. Keyword search then finds documents that contain the exact query words, while semantic search finds documents with a similar meaning even when the wording differs. The two result sets are combined and duplicates are removed. The combined list is ranked by a relevance score that takes both methods into account, and the best results are returned to the user. Because it covers both exact matches and related meanings, this method finds relevant documents that a single search type could miss.