How to Use Chroma with Langchain: Simple Guide
To use Chroma with Langchain, first install the chromadb package, then create a Chroma vector store instance with your documents and embeddings. Use this vector store in Langchain's retriever or chain to perform fast similarity search and retrieval.

Syntax
The basic syntax to use Chroma with Langchain involves creating a Chroma vector store by passing your documents and an embedding model. Then, you can create a retriever from this store to query similar documents.
- Chroma: The vector database instance.
- embedding_function: The model that converts text to vectors.
- documents: The list of text documents to index.
- retriever: Used to search the vector store.
```python
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings

# Initialize embeddings
embedding_function = OpenAIEmbeddings()

# Create Chroma vector store with documents
vectordb = Chroma.from_texts(texts=['doc1', 'doc2'], embedding=embedding_function)

# Create retriever
retriever = vectordb.as_retriever()

# Use retriever to get relevant docs
results = retriever.get_relevant_documents('query text')
```
Example
This example shows how to create a Chroma vector store with sample documents, then query it to find the most relevant document using Langchain's retriever.
```python
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings

# Sample documents
texts = [
    "Langchain helps build LLM apps.",
    "Chroma is a vector database.",
    "Python is great for AI development."
]

# Initialize embedding model
embedding_function = OpenAIEmbeddings()

# Create Chroma vector store
vectordb = Chroma.from_texts(texts=texts, embedding=embedding_function)

# Create retriever
retriever = vectordb.as_retriever()

# Query the retriever
query = "What helps build applications with large language models?"
results = retriever.get_relevant_documents(query)

# Print the most relevant document
print(results[0].page_content)
```
Output
Langchain helps build LLM apps.
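The retriever returns "Langchain helps build LLM apps." because that document's embedding scores highest on cosine similarity against the query's embedding. A minimal pure-Python sketch of that scoring step, using made-up 3-dimensional vectors in place of real OpenAI embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings; in practice these come from the embedding model
doc_vectors = {
    "Langchain helps build LLM apps.": [0.9, 0.1, 0.0],
    "Chroma is a vector database.": [0.1, 0.8, 0.2],
    "Python is great for AI development.": [0.2, 0.3, 0.7],
}
query_vector = [0.8, 0.2, 0.1]  # toy embedding of the query

# Rank documents by similarity, highest first (what the retriever does internally)
ranked = sorted(doc_vectors,
                key=lambda d: cosine_similarity(doc_vectors[d], query_vector),
                reverse=True)
print(ranked[0])  # → Langchain helps build LLM apps.
```

Chroma performs this kind of nearest-neighbor ranking over the stored vectors, just with high-dimensional embeddings and an indexed search rather than a linear scan.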
Common Pitfalls
Common mistakes when using Chroma with Langchain include:
- Not installing the chromadb package, causing import errors.
- Forgetting to provide an embedding function, which is required to convert text to vectors.
- Passing empty or invalid documents to Chroma.from_texts, resulting in no data indexed.
- Not calling as_retriever() before querying, which leads to attribute errors.
```python
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings

# Wrong: missing embedding function
# vectordb = Chroma.from_texts(texts=['doc1', 'doc2'])  # This will fail

# Right: provide an embedding function
embedding_function = OpenAIEmbeddings()
vectordb = Chroma.from_texts(texts=['doc1', 'doc2'], embedding=embedding_function)
```
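To guard against the empty-or-invalid-documents pitfall, it can help to validate the input list before calling Chroma.from_texts. A small illustrative helper (validate_texts is a hypothetical name, not part of Langchain or Chroma):

```python
def validate_texts(texts):
    """Raise early instead of silently indexing nothing."""
    if not texts:
        raise ValueError("No documents provided: Chroma.from_texts would index nothing")
    # Keep only non-empty strings, stripped of surrounding whitespace
    cleaned = [t.strip() for t in texts if isinstance(t, str) and t.strip()]
    if not cleaned:
        raise ValueError("All documents were empty or non-string")
    return cleaned

# Usage before building the store:
# vectordb = Chroma.from_texts(texts=validate_texts(texts), embedding=embedding_function)
```

Failing fast here gives a clear error message instead of a vector store that returns no results at query time.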
Quick Reference
| Step | Description |
|---|---|
| Install chromadb | pip install chromadb |
| Import Chroma and embeddings | from langchain.vectorstores import Chroma; from langchain.embeddings.openai import OpenAIEmbeddings |
| Create embedding model | embedding_function = OpenAIEmbeddings() |
| Create vector store | vectordb = Chroma.from_texts(texts, embedding=embedding_function) |
| Create retriever | retriever = vectordb.as_retriever() |
| Query retriever | results = retriever.get_relevant_documents(query) |
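Conceptually, the steps in the table wire together as in the toy sketch below: a self-contained stand-in that mimics the from_texts → as_retriever → get_relevant_documents flow. The classes and the keyword-counting embedding are illustrative stand-ins, not Langchain's actual implementations:

```python
class ToyVectorStore:
    """Minimal stand-in for Chroma: stores (text, vector) pairs."""
    def __init__(self, embedding):
        self.embedding = embedding
        self.docs = []

    @classmethod
    def from_texts(cls, texts, embedding):
        store = cls(embedding)
        store.docs = [(t, embedding(t)) for t in texts]
        return store

    def as_retriever(self):
        return ToyRetriever(self)


class ToyRetriever:
    """Minimal stand-in for a Langchain retriever."""
    def __init__(self, store):
        self.store = store

    def get_relevant_documents(self, query):
        qv = self.store.embedding(query)
        # Rank stored documents by dot product with the query vector, highest first
        def score(item):
            _, v = item
            return sum(x * y for x, y in zip(v, qv))
        return [t for t, _ in sorted(self.store.docs, key=score, reverse=True)]


def toy_embedding(text):
    """Stand-in embedding: counts of a few keyword stems."""
    words = text.lower()
    return [words.count("langchain"), words.count("vector"), words.count("python")]


texts = ["Langchain helps build LLM apps.",
         "Chroma is a vector database.",
         "Python is great for AI development."]
store = ToyVectorStore.from_texts(texts, toy_embedding)
retriever = store.as_retriever()
print(retriever.get_relevant_documents("How do I use langchain?")[0])
```

The real pipeline swaps the toy pieces for OpenAIEmbeddings and Chroma, but the shape of the calls is the same.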
Key Takeaways
- Always provide an embedding function when creating a Chroma vector store in Langchain.
- Use the retriever interface from Chroma to perform similarity searches on your documents.
- Install the chromadb package before using Chroma to avoid import errors.
- Pass meaningful text documents to index for effective retrieval results.
- Call as_retriever() on the Chroma instance before querying for documents.