LangChain framework · ~30 mins

Hybrid search (keyword + semantic) in LangChain - Mini Project: Build & Apply

Hybrid Search with LangChain: Keyword + Semantic
📖 Scenario: You are building a smart search feature for a small document collection. Users want to find documents either by typing keywords or by meaning (semantic search). Combining both methods usually returns more relevant results than either method alone.
🎯 Goal: Create a hybrid search system using LangChain that first filters documents by keyword, then ranks the remaining documents by semantic similarity to a query.
📋 What You'll Learn
Create a list of documents with exact text content
Define a keyword to filter documents
Use a LangChain embedding model to get semantic vectors
Combine keyword filtering and semantic similarity ranking
Return the top matching documents
💡 Why This Matters
🌍 Real World
Hybrid search is used in apps like document search, customer support, and knowledge bases to find relevant info quickly by combining exact word matches and meaning.
💼 Career
Understanding hybrid search with LangChain is valuable for roles in AI, data science, and software development focused on search engines and natural language processing.
1
Create the document list
Create a list called documents with these exact strings: 'Langchain is a framework for building applications', 'Semantic search finds meaning', 'Keyword search matches exact words', 'Hybrid search combines both methods'.
Need a hint?

Use a Python list with the exact strings given.
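A minimal sketch of this step, using the exact strings from the task:

```python
# The small document collection used throughout the project.
documents = [
    'Langchain is a framework for building applications',
    'Semantic search finds meaning',
    'Keyword search matches exact words',
    'Hybrid search combines both methods',
]
```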

2
Set the keyword filter
Create a variable called keyword and set it to the string 'search' to filter documents containing this word.
Need a hint?

Assign the exact string 'search' to the variable keyword.
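A minimal sketch of this step. The `sample` and `matches` names are only illustrative and are not part of the task; they preview how the keyword will later be matched case-insensitively.

```python
# The keyword used to pre-filter documents before semantic ranking.
keyword = 'search'

# Illustrative check (not required by the task): case-insensitive substring match.
sample = 'Semantic search finds meaning'
matches = keyword.lower() in sample.lower()
```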

3
Filter documents by keyword and embed
Import OpenAIEmbeddings from langchain.embeddings (recent LangChain releases provide this class in the langchain_openai package instead, and it needs an OpenAI API key to run). Create an instance called embedding_model. Filter documents into filtered_docs, keeping only those that contain keyword (case-insensitive). Then create a list embedded_docs by applying embedding_model.embed_query() to each document in filtered_docs.
Need a hint?

Use list comprehensions to filter and embed the documents. Remember to import the embedding class first.
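The filter-and-embed step can be sketched as follows. So that the example runs without an API key, it assumes a hypothetical `ToyEmbeddings` class (a bag-of-words stand-in that is not part of LangChain) exposing the same `embed_query()` method; in the real project, `embedding_model` would be an `OpenAIEmbeddings` instance instead.

```python
documents = [
    'Langchain is a framework for building applications',
    'Semantic search finds meaning',
    'Keyword search matches exact words',
    'Hybrid search combines both methods',
]
keyword = 'search'


class ToyEmbeddings:
    """Hypothetical stand-in for a LangChain embedding model (assumption)."""

    def __init__(self, vocabulary):
        # Map each known word to a fixed position in the vector.
        self.index = {word: i for i, word in enumerate(vocabulary)}

    def embed_query(self, text):
        # Count how often each vocabulary word occurs in the text.
        vector = [0.0] * len(self.index)
        for token in text.lower().split():
            if token in self.index:
                vector[self.index[token]] += 1.0
        return vector


# Build the toy vocabulary from the documents themselves.
vocabulary = sorted({word for doc in documents for word in doc.lower().split()})
embedding_model = ToyEmbeddings(vocabulary)  # real project: OpenAIEmbeddings()

# Keep only the documents that contain the keyword, ignoring case.
filtered_docs = [doc for doc in documents if keyword.lower() in doc.lower()]

# Embed each remaining document into a vector.
embedded_docs = [embedding_model.embed_query(doc) for doc in filtered_docs]
```

The keyword filter runs first, so only documents that pass it are embedded, which keeps the number of embedding calls small.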

4
Rank filtered documents by semantic similarity
Import cosine_similarity from sklearn.metrics.pairwise. Create a variable query_embedding by embedding the string 'hybrid search' with embedding_model.embed_query(). Compute a list similarities holding the cosine similarity between query_embedding and each vector in embedded_docs; cosine_similarity expects 2D inputs, so wrap each vector in a list and take element [0][0] of the result. Create a list ranked_docs by sorting filtered_docs in descending order of their values in similarities. The final code combines keyword filtering with semantic ranking.
Need a hint?

Use cosine similarity to rank documents by meaning. Sort documents by similarity descending.
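The whole pipeline can be sketched end to end as below. To stay runnable without an API key or scikit-learn, it makes two labeled substitutions: a hypothetical `ToyEmbeddings` bag-of-words class stands in for `OpenAIEmbeddings`, and a hand-written `cosine()` helper stands in for `sklearn.metrics.pairwise.cosine_similarity`. Neither is the project's required code; they only illustrate the filter-then-rank logic.

```python
import math

documents = [
    'Langchain is a framework for building applications',
    'Semantic search finds meaning',
    'Keyword search matches exact words',
    'Hybrid search combines both methods',
]
keyword = 'search'


class ToyEmbeddings:
    """Hypothetical stand-in for a LangChain embedding model (assumption)."""

    def __init__(self, vocabulary):
        self.index = {word: i for i, word in enumerate(vocabulary)}

    def embed_query(self, text):
        # Bag-of-words counts over the known vocabulary.
        vector = [0.0] * len(self.index)
        for token in text.lower().split():
            if token in self.index:
                vector[self.index[token]] += 1.0
        return vector


def cosine(a, b):
    """Plain-Python cosine similarity (stand-in for sklearn's function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vocabulary = sorted({word for doc in documents for word in doc.lower().split()})
embedding_model = ToyEmbeddings(vocabulary)  # real project: OpenAIEmbeddings()

# Stage 1: keyword filter (case-insensitive).
filtered_docs = [doc for doc in documents if keyword.lower() in doc.lower()]
embedded_docs = [embedding_model.embed_query(doc) for doc in filtered_docs]

# Stage 2: semantic ranking against the query.
query_embedding = embedding_model.embed_query('hybrid search')
similarities = [cosine(query_embedding, vec) for vec in embedded_docs]

# Sort documents by similarity, highest first.
ranked_docs = [
    doc
    for _, doc in sorted(
        zip(similarities, filtered_docs), key=lambda pair: pair[0], reverse=True
    )
]
```

With the toy embeddings, the document that shares both query words ranks first. A real embedding model would produce different scores, so the order of the remaining documents may differ.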