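Metadata filtering helps you find specific items in a vector store by using extra information stored with those items, such as author, date, or category. It restricts a similarity search to items that meet your conditions, which makes the results more relevant.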
Metadata filtering in vector stores in LangChain
Introduction
You want to search documents but only from a certain date or author.
You have many items and want to narrow results by category or tag.
You want to exclude some items from search results based on their properties.
You want to combine text similarity with specific conditions like language or type.
Syntax
results = vector_store.similarity_search(query, filter={"key": "value"})
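The filter is a dictionary of metadata keys and the values they must match.
Only items whose metadata matches the filter are returned. With a plain dictionary like this, as FAISS accepts in the sample program, an item must match every key-value pair.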
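To make the matching rule concrete, here is a minimal sketch in plain Python. It is not LangChain's internal code; the helper `matches_filter` and the sample records are made up for illustration. It assumes exact equality on every key, which is how a simple dictionary filter typically behaves.

```python
# Illustrative sketch only: plain Python, not LangChain's implementation.
def matches_filter(metadata: dict, conditions: dict) -> bool:
    """Return True when every condition key is present with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in conditions.items()
    )

# Hypothetical metadata records, as they might be stored next to the vectors.
items = [
    {"topic": "programming", "level": "beginner", "year": 2023},
    {"topic": "programming", "level": "advanced", "year": 2023},
    {"topic": "cooking", "level": "beginner", "year": "2023"},
]

# Two conditions: an item must satisfy both of them.
both = [m for m in items if matches_filter(m, {"topic": "programming", "level": "beginner"})]

# Equality is type-sensitive: the integer 2023 and the string "2023" differ.
by_int_year = [m for m in items if matches_filter(m, {"year": 2023})]
by_str_year = [m for m in items if matches_filter(m, {"year": "2023"})]

# Keys are case-sensitive: "Topic" is not the same key as "topic".
wrong_case = [m for m in items if matches_filter(m, {"Topic": "programming"})]

print(len(both), len(by_int_year), len(by_str_year), len(wrong_case))
```

The takeaway is to store metadata with consistent key names and value types, because a filter value of a different type or a key with different casing silently matches nothing.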
Examples
Search for items about 'climate change' but only from the year 2023.
results = vector_store.similarity_search("climate change", filter={"year": "2023"})
Find dessert recipes that are easy to make.
results = vector_store.similarity_search("recipe", filter={"category": "dessert", "difficulty": "easy"})
Search machine learning documents only in English.
results = vector_store.similarity_search("machine learning", filter={"language": "English"})
Sample Program
This example creates a vector store with three documents, each having metadata about topic and level. It searches for documents related to 'Python' but only those tagged as programming and beginner level. The output shows matching document texts.
# Requires the langchain-community, langchain-openai, and faiss-cpu packages,
# plus an OPENAI_API_KEY environment variable.
# Older LangChain releases exposed these classes as langchain.vectorstores
# and langchain.embeddings.
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Sample documents with metadata
documents = [
    {"text": "Learn Python basics.", "metadata": {"topic": "programming", "level": "beginner"}},
    {"text": "Advanced Python techniques.", "metadata": {"topic": "programming", "level": "advanced"}},
    {"text": "Cooking pasta recipes.", "metadata": {"topic": "cooking", "level": "beginner"}},
]

# Create embeddings
embeddings = OpenAIEmbeddings()

# Prepare texts and metadata as parallel lists
texts = [doc["text"] for doc in documents]
metadatas = [doc["metadata"] for doc in documents]

# Create vector store with metadata
vector_store = FAISS.from_texts(texts, embeddings, metadatas=metadatas)

# Search for programming documents at beginner level
results = vector_store.similarity_search(
    "Python", filter={"topic": "programming", "level": "beginner"}
)
for r in results:
    print(r.page_content)
Output
Learn Python basics.
Important Notes
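Filters must match metadata keys exactly as stored, including letter case, and values must have the same type as the stored value.
Not all vector stores support metadata filtering, and those that do can differ in filter syntax; check your store's documentation.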
Filtering helps reduce noise and improves search relevance.
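If your store does not support filtering, one possible workaround is to fetch more results than you need and filter them yourself. The sketch below is plain Python under stated assumptions: `Doc` is a hypothetical stand-in for a search result with `page_content` and `metadata` attributes, `post_filter` is a made-up helper, and `candidates` pretends to be the ranked output of an unfiltered search.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a search result with text and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def post_filter(results, conditions, k):
    """Keep results whose metadata equals every condition, in rank order, up to k."""
    kept = [
        doc for doc in results
        if all(
            key in doc.metadata and doc.metadata[key] == value
            for key, value in conditions.items()
        )
    ]
    return kept[:k]


# Pretend these came back from an unfiltered similarity search, best match first.
candidates = [
    Doc("Advanced Python techniques.", {"topic": "programming", "level": "advanced"}),
    Doc("Learn Python basics.", {"topic": "programming", "level": "beginner"}),
    Doc("Cooking pasta recipes.", {"topic": "cooking", "level": "beginner"}),
]

top = post_filter(candidates, {"topic": "programming", "level": "beginner"}, k=2)
for doc in top:
    print(doc.page_content)
```

The trade-off is that filtering after the search can return fewer than k items when few candidates match, so you may need to request a larger candidate set from the store.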
Summary
Metadata filtering narrows search results using the extra information stored with each item.
Use a dictionary with key-value pairs to filter results.
It works well when you want specific categories or properties in your search.