How to Create Embeddings with LangChain: Simple Guide
To create embeddings in LangChain, use the OpenAIEmbeddings class from langchain.embeddings. Instantiate it and call embed_query or embed_documents to convert text into vector embeddings for semantic search or other tasks.
Syntax
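The main class for creating embeddings in LangChain is OpenAIEmbeddings. Import it, create an instance, then call embed_query(text) to get an embedding vector for a single text query, or embed_documents(list_of_texts) for multiple texts. Newer LangChain releases ship this class in the separate langchain-openai package (from langchain_openai import OpenAIEmbeddings); the langchain.embeddings path used in this guide is the older import location and may emit a deprecation warning or be unavailable, depending on the installed version.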
- OpenAIEmbeddings(): Initializes the embedding model.
- embed_query(text): Returns a vector for one text string.
- embed_documents(texts): Returns a list of vectors for multiple texts.
python
from langchain.embeddings import OpenAIEmbeddings

# Create embeddings instance
embeddings = OpenAIEmbeddings()

# Embed a single query
vector = embeddings.embed_query("Hello world")

# Embed multiple documents
vectors = embeddings.embed_documents(["Hello world", "LangChain is great"])
Example
This example shows how to create embeddings for a single sentence and a list of sentences using LangChain's OpenAIEmbeddings. It prints the vector lengths to confirm embeddings were created.
python
from langchain.embeddings import OpenAIEmbeddings

# Initialize embeddings
embeddings = OpenAIEmbeddings()

# Single text embedding
single_vector = embeddings.embed_query("LangChain makes working with language models easy.")
print(f"Single embedding vector length: {len(single_vector)}")

# Multiple texts embedding
texts = ["Hello world", "Creating embeddings with LangChain"]
multi_vectors = embeddings.embed_documents(texts)
print(f"Number of embeddings: {len(multi_vectors)}")
print(f"Length of first embedding vector: {len(multi_vectors[0])}")
Output
Single embedding vector length: 1536
Number of embeddings: 2
Length of first embedding vector: 1536
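Since these vectors are meant for semantic search, it can help to see how two of them are compared. The sketch below ranks documents by cosine similarity to a query. It is a minimal illustration, not LangChain's own search code: the helper names cosine_similarity and rank_by_similarity are hypothetical, and the 3-dimensional vectors are hand-written stand-ins for the much longer vectors that embed_query and embed_documents return.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, doc_vectors):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vector, v) for v in doc_vectors]
    return sorted(range(len(doc_vectors)), key=lambda i: scores[i], reverse=True)


# Toy stand-ins for real embeddings (assumption: real vectors are far longer)
query_vector = [1.0, 0.0, 0.0]
doc_vectors = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # points almost the same way as the query
    [0.5, 0.5, 0.0],  # somewhere in between
]

ranking = rank_by_similarity(query_vector, doc_vectors)
print(ranking)  # [1, 2, 0]
```

With real data, query_vector would come from embed_query and doc_vectors from embed_documents; for larger collections a vector store is usually a better fit than a plain Python loop.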
Common Pitfalls
Common mistakes when creating embeddings in LangChain include:
- Not setting up your OpenAI API key in environment variables, causing authentication errors.
- Passing non-string inputs to embed_query or embed_documents, which expect strings.
- Confusing embed_query (single string) with embed_documents (list of strings).
- Expecting embeddings to be human-readable; embeddings are numeric vectors used internally.
python
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

# Wrong: passing integer instead of string
# vector = embeddings.embed_query(123)  # This will raise an error

# Right: always pass string
vector = embeddings.embed_query("123")
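One way to guard against both the non-string pitfall and the single-string versus list confusion is to normalize inputs before they reach the embedding methods. The helper below is a hypothetical sketch, not part of LangChain: as_text_list is an illustrative name, and it only checks types, so it runs without an API key.

```python
def as_text_list(inputs):
    """Normalize inputs to a list of strings (hypothetical helper, not a LangChain API).

    A single string becomes a one-item list; a list or tuple is checked
    item by item; anything else is rejected with a TypeError.
    """
    if isinstance(inputs, str):
        return [inputs]
    if not isinstance(inputs, (list, tuple)):
        raise TypeError(f"expected str or list of str, got {type(inputs).__name__}")
    for position, item in enumerate(inputs):
        if not isinstance(item, str):
            raise TypeError(f"item {position} is {type(item).__name__}, expected str")
    return list(inputs)


print(as_text_list("Hello world"))                          # ['Hello world']
print(as_text_list(["Hello world", "LangChain is great"]))  # ['Hello world', 'LangChain is great']

try:
    as_text_list(["ok", 123])
except TypeError as err:
    print(err)  # item 1 is int, expected str
```

The result can be passed straight to embed_documents. The helper raises instead of silently calling str() on bad values so that a conversion such as "123" stays an explicit decision in the calling code.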
Quick Reference
Remember these tips when creating embeddings with LangChain:
- Always install langchain and set your OpenAI API key as OPENAI_API_KEY in your environment.
- Use OpenAIEmbeddings() to create an embeddings object.
- Use embed_query for single texts and embed_documents for lists.
- Embeddings are numeric vectors used for similarity search, not readable text.
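Because a missing OPENAI_API_KEY only shows up as an authentication error once a request is made, a small check at startup can make the problem easier to spot. This is a minimal sketch under stated assumptions: require_openai_key is a hypothetical helper rather than a LangChain function, and the key in demo_env is a placeholder, not a real credential.

```python
import os


def require_openai_key(env=None):
    """Return the API key, or raise a clear error if it is missing.

    Hypothetical helper. `env` defaults to os.environ; passing a plain
    dict makes the check easy to try out without touching the real
    environment.
    """
    env = os.environ if env is None else env
    key = (env.get("OPENAI_API_KEY") or "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating OpenAIEmbeddings()"
        )
    return key


# Demonstration with a stand-in environment (placeholder value, not a real key)
demo_env = {"OPENAI_API_KEY": "sk-example"}
print(require_openai_key(demo_env) == "sk-example")  # True

try:
    require_openai_key({})
except RuntimeError as err:
    print(err)
```

In a real script, calling require_openai_key() with no arguments before OpenAIEmbeddings() would read the actual environment.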
Key Takeaways
Use OpenAIEmbeddings from langchain.embeddings to create embeddings easily.
Call embed_query for single text and embed_documents for multiple texts.
Ensure your OpenAI API key is set in environment variables before running.
Pass only strings to embedding methods to avoid errors.
Embeddings are vectors for machine use, not human-readable text.