The user query is run through both keyword search and semantic search; the two result sets are then combined and ranked before being shown to the user.
Execution Sample
LangChain
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import BM25Retriever
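# Assume bm25 (BM25Retriever) and vectorstore (FAISS) are already initialized,
# and query holds the user's search string.
bm25.k = 3  # BM25Retriever returns 4 documents by default; match the k=3 used below
# Perform keyword and semantic search
results_keyword = bm25.get_relevant_documents(query)
results_semantic = vectorstore.similarity_search(query, k=3)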
This code runs a keyword search and a semantic search on the same query and collects the top 3 results from each (for the BM25 retriever, this assumes its k is set to 3).
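The difference between the two searches can be imitated without any external service. The sketch below is a toy stand-in, not LangChain code: the corpus `DOCS`, the synonym table `CONCEPTS` (a crude substitute for an embedding model), and the helper names are all made up for illustration.

```python
import math
from collections import Counter

# Toy corpus; the document texts are invented for this example.
DOCS = {
    "Doc1": "climate change impact on coastal cities",
    "Doc2": "global warming effects on agriculture",
    "Doc3": "stock market impact of interest rates",
}

# Hypothetical synonym table standing in for a learned embedding model:
# words that map to the same concept are treated as having the same meaning.
CONCEPTS = {
    "climate": "climate", "warming": "climate",
    "impact": "effect", "effects": "effect",
}


def tokenize(text):
    return text.lower().split()


def keyword_search(query, k=3):
    """Rank documents by how many distinct query words they contain verbatim."""
    terms = set(tokenize(query))
    scores = {d: len(terms & set(tokenize(t))) for d, t in DOCS.items()}
    hits = [d for d in scores if scores[d] > 0]
    return sorted(hits, key=lambda d: (-scores[d], d))[:k]


def concept_vector(text):
    """Bag-of-concepts vector: each word is replaced by its concept, if it has one."""
    return Counter(CONCEPTS.get(tok, tok) for tok in tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_search(query, k=3):
    """Rank documents by cosine similarity of their concept vectors to the query."""
    q = concept_vector(query)
    scores = {d: cosine(q, concept_vector(t)) for d, t in DOCS.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


query = "climate change impact"
results_keyword = keyword_search(query, k=2)
results_semantic = semantic_search(query, k=2)
print(results_keyword)   # ['Doc1', 'Doc3']
print(results_semantic)  # ['Doc1', 'Doc2']
```

Doc2 shares no words with the query, so the keyword search misses it, while the concept-based search ranks it second. Doc3 matches only on the literal word "impact", so it surfaces in the keyword results but not in the top semantic results.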
Execution Table
| Step | Action | Input | Output | Notes |
|------|--------|-------|--------|-------|
| 1 | Receive user query | "climate change impact" | Query stored | Start with raw user input |
| 2 | Run keyword search | "climate change impact" | Top 3 docs matching keywords | Find docs with exact words |
| 3 | Run semantic search | "climate change impact" | Top 3 docs by meaning | Find docs with similar meaning |
| 4 | Combine results | Keyword + Semantic results | 6 docs combined | Merge both result sets |
| 5 | Rank combined results | 6 docs | Final ranked list | Sort by relevance score |
| 6 | Return results | Final ranked list | Displayed to user | Show best matches |
| 7 | Exit | N/A | Search complete | Process ends |
💡 All steps complete, final ranked results returned to user
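The steps in the table can be traced in a few lines of Python. This is a minimal sketch under stated assumptions: `hybrid_search` is a hypothetical helper rather than a LangChain function, the two stub retrievers return fixed lists, and the `SCORES` values are invented so that the output matches the final order shown in the Variable Tracker.

```python
def hybrid_search(query, keyword_search, semantic_search, score):
    """Run both searches, merge the results, and rank them by a relevance score."""
    results_keyword = keyword_search(query)      # Step 2
    results_semantic = semantic_search(query)    # Step 3

    # Step 4: concatenate, keeping only the first occurrence of a repeated document.
    combined_results = list(dict.fromkeys(results_keyword + results_semantic))

    # Step 5: highest relevance score first.
    final_results = sorted(combined_results, key=score, reverse=True)
    return final_results                         # Step 6


# Stub retrievers (assumptions): real ones would query an index.
def stub_keyword_search(query):
    return ["Doc1", "Doc2", "Doc3"]


def stub_semantic_search(query):
    return ["Doc4", "Doc5", "Doc6"]


# Invented relevance scores, chosen only to reproduce the example ordering.
SCORES = {"Doc1": 0.80, "Doc2": 0.95, "Doc3": 0.60,
          "Doc4": 0.90, "Doc5": 0.70, "Doc6": 0.50}

final_results = hybrid_search(
    "climate change impact", stub_keyword_search, stub_semantic_search, SCORES.get
)
print(final_results)  # ['Doc2', 'Doc4', 'Doc1', 'Doc5', 'Doc3', 'Doc6']
```

Passing the retrievers and the scoring function in as arguments keeps the merge-and-rank logic independent of any particular search backend.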
Variable Tracker
| Variable | Start | After Step 2 | After Step 3 | After Step 4 | After Step 5 | Final |
|----------|-------|--------------|--------------|--------------|--------------|-------|
| query | "climate change impact" | "climate change impact" | "climate change impact" | "climate change impact" | "climate change impact" | "climate change impact" |
| results_keyword | None | [Doc1, Doc2, Doc3] | [Doc1, Doc2, Doc3] | [Doc1, Doc2, Doc3] | [Doc1, Doc2, Doc3] | [Doc1, Doc2, Doc3] |
| results_semantic | None | None | [Doc4, Doc5, Doc6] | [Doc4, Doc5, Doc6] | [Doc4, Doc5, Doc6] | [Doc4, Doc5, Doc6] |
| combined_results | None | None | None | [Doc1, Doc2, Doc3, Doc4, Doc5, Doc6] | [Doc1, Doc2, Doc3, Doc4, Doc5, Doc6] | [Doc1, Doc2, Doc3, Doc4, Doc5, Doc6] |
| final_results | None | None | None | None | [Doc2, Doc4, Doc1, Doc5, Doc3, Doc6] | [Doc2, Doc4, Doc1, Doc5, Doc3, Doc6] |
Key Moments - 3 Insights
Why do we run both keyword and semantic searches instead of just one?
Because keyword search finds exact word matches (see Step 2 in the Execution Table), while semantic search finds documents with similar meaning even when the words differ (Step 3). Combining the two gives better coverage than either one alone.
How are the combined results ranked before returning?
After merging keyword and semantic results (Step 4), they are ranked by relevance scores that consider both keyword matches and semantic similarity (Step 5). This ensures the best overall matches appear first.
What happens if the same document appears in both keyword and semantic results?
Duplicates are removed during combination (Step 4), so each document appears only once in the final ranked list.
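The source does not say which formula produces the relevance scores. One widely used option that merges, de-duplicates, and ranks in a single pass is reciprocal rank fusion (RRF): each document earns 1 / (k + rank) from every list it appears in, and the contributions are summed. The sketch below is an illustration of that technique, not necessarily what this tutorial's pipeline does; the function name and the example lists are assumptions, and k = 60 is a commonly used constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    A document that appears in more than one list accumulates score from
    each appearance, so it is listed once and tends to rank higher.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Example lists (made up): Doc2 is returned by both searches.
results_keyword = ["Doc1", "Doc2", "Doc3"]
results_semantic = ["Doc2", "Doc4", "Doc5"]

fused = reciprocal_rank_fusion([results_keyword, results_semantic])
print(fused)  # ['Doc2', 'Doc1', 'Doc4', 'Doc3', 'Doc5']
```

Doc2 appears only once in the fused list and moves to the top because both searches returned it. RRF uses only rank positions, so the keyword and semantic scores never need to be put on a common scale.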
Visual Quiz - 3 Questions
Test your understanding
In the Execution Table, what is the output after Step 3?
A. Top 3 docs matching keywords
B. Top 3 docs by meaning
C. Final ranked list
D. Combined 6 docs
💡 Hint: Check the Output column for Step 3 in the Execution Table.
At which step are the keyword and semantic results combined?
A. Step 4
B. Step 3
C. Step 2
D. Step 5
💡 Hint: Look for the action "Combine results" in the Execution Table.
If the query changes, which variable in the Variable Tracker updates first?
A. results_keyword
B. combined_results
C. query
D. final_results
💡 Hint: In the Variable Tracker, query is set at Start and remains constant through every later step.
Concept Snapshot
Hybrid search combines keyword and semantic search.
Keyword search finds exact word matches.
Semantic search finds meaning-based matches.
Results are merged and ranked by relevance.
Covering both exact matches and related meanings can improve recall and overall relevance compared with using either method alone.
Full Transcript
Hybrid search in LangChain means running both keyword search and semantic search on the same user query. First, the query is taken as input. Keyword search then finds documents that contain the exact words, and semantic search finds documents with similar meaning even when the words differ. The two result sets are combined and duplicates are removed. The combined list is ranked by relevance scores that take both methods into account. Finally, the best results are returned to the user. Covering both exact matches and related meanings makes it more likely that relevant documents are found.