
How to Create Embeddings with LangChain: Simple Guide

To create embeddings in LangChain, use the OpenAIEmbeddings class from langchain.embeddings. Instantiate it and call embed_query or embed_documents to convert text into vector embeddings for semantic search or other tasks.

Syntax

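The main class for creating embeddings in LangChain is OpenAIEmbeddings. Import it, then create an instance. Call embed_query(text) to get the embedding vector for a single text query, or embed_documents(list_of_texts) to get one vector per text. In LangChain 0.1 and later the class is also published in the separate langchain-openai package (from langchain_openai import OpenAIEmbeddings) and the langchain.embeddings path is deprecated, so switch the import if you see a deprecation warning.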

  • OpenAIEmbeddings(): Initializes the embedding model.
  • embed_query(text): Returns a vector for one text string.
  • embed_documents(texts): Returns a list of vectors for multiple texts.
python
from langchain.embeddings import OpenAIEmbeddings

# Create embeddings instance
embeddings = OpenAIEmbeddings()

# Embed a single query
vector = embeddings.embed_query("Hello world")

# Embed multiple documents
vectors = embeddings.embed_documents(["Hello world", "LangChain is great"])
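To make the return shapes concrete without calling the OpenAI API, here is a minimal offline sketch. ToyEmbeddings is a hypothetical stand-in written for this guide, not a LangChain class; it only mirrors the two method names and their return types (a list of floats for embed_query, a list of such lists for embed_documents). Its hash-based vectors carry no semantic meaning.

```python
import hashlib


class ToyEmbeddings:
    """Hypothetical offline stand-in; NOT part of LangChain."""

    def __init__(self, size: int = 8) -> None:
        # A SHA-256 digest has 32 bytes, so at most 32 dimensions are available.
        if not 1 <= size <= 32:
            raise ValueError("size must be between 1 and 32")
        self.size = size

    def embed_query(self, text: str) -> list[float]:
        # Hash the text and scale each byte into the range [0, 1].
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255 for byte in digest[: self.size]]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # One vector per input text, in the same order as the input.
        return [self.embed_query(text) for text in texts]


embeddings = ToyEmbeddings(size=8)
vector = embeddings.embed_query("Hello world")
vectors = embeddings.embed_documents(["Hello world", "LangChain is great"])

print(len(vector))   # number of dimensions in the single vector
print(len(vectors))  # number of vectors, one per document
```

Because the interface matches, code written against this stand-in should need only the constructor swapped to use the real class, assuming it relies on nothing beyond these two methods.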

Example

This example creates embeddings for a single sentence and for a list of sentences using LangChain's OpenAIEmbeddings. It prints the vector lengths to confirm that the embeddings were created; the exact length depends on the embedding model (1536 for OpenAI's text-embedding-ada-002).

python
from langchain.embeddings import OpenAIEmbeddings

# Initialize embeddings
embeddings = OpenAIEmbeddings()

# Single text embedding
single_vector = embeddings.embed_query("LangChain makes working with language models easy.")
print(f"Single embedding vector length: {len(single_vector)}")

# Multiple texts embedding
texts = ["Hello world", "Creating embeddings with LangChain"]
multi_vectors = embeddings.embed_documents(texts)
print(f"Number of embeddings: {len(multi_vectors)}")
print(f"Length of first embedding vector: {len(multi_vectors[0])}")
Output
Single embedding vector length: 1536
Number of embeddings: 2
Length of first embedding vector: 1536
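Once you have vectors, semantic search usually comes down to comparing them. The sketch below ranks documents by cosine similarity using only the standard library. The 3-dimensional vectors are hand-made placeholders standing in for real embeddings, and the helper names cosine_similarity and rank_by_similarity are illustrative, not LangChain functions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, doc_vectors):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vector, v) for v in doc_vectors]
    return sorted(range(len(doc_vectors)), key=lambda i: scores[i], reverse=True)


# Hand-made vectors standing in for real embeddings.
query_vector = [1.0, 0.0, 0.0]
doc_vectors = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # points almost the same way as the query
    [-1.0, 0.0, 0.0],  # points the opposite way
]

ranking = rank_by_similarity(query_vector, doc_vectors)
print(ranking)  # -> [1, 0, 2]
```

With real embeddings you would pass the result of embed_query as query_vector and the result of embed_documents as doc_vectors. For larger collections a vector store is the more practical choice than this linear scan.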

Common Pitfalls

Common mistakes when creating embeddings in LangChain include:

  • Not setting up your OpenAI API key in environment variables, causing authentication errors.
  • Passing non-string inputs to embed_query or embed_documents, which expect strings.
  • Confusing embed_query (single string) with embed_documents (list of strings).
  • Expecting embeddings to be human-readable; embeddings are numeric vectors used internally.
python
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

# Wrong: passing integer instead of string
# vector = embeddings.embed_query(123)  # This will raise an error

# Right: always pass string
vector = embeddings.embed_query("123")
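One way to guard against the type and single-versus-list mix-ups listed above is to normalize input before embedding it. The helper below is a hypothetical sketch (as_text_list is not a LangChain function): it wraps a bare string in a list and rejects anything that is not a string or a list of strings.

```python
def as_text_list(value) -> list[str]:
    """Normalize input for embed_documents-style calls (hypothetical helper)."""
    if isinstance(value, str):
        # A bare string is one document, not a sequence of characters.
        return [value]
    if isinstance(value, (list, tuple)):
        bad = [item for item in value if not isinstance(item, str)]
        if bad:
            raise TypeError(f"expected only strings, got {type(bad[0]).__name__}")
        return list(value)
    raise TypeError(f"expected str or list of str, got {type(value).__name__}")


print(as_text_list("Hello world"))
print(as_text_list(["Hello world", "LangChain is great"]))

try:
    as_text_list(123)
except TypeError as exc:
    print(f"Rejected: {exc}")
```

The returned list can then be passed to embed_documents, so the same call path works whether the caller supplied one text or several.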

Quick Reference

Remember these tips when creating embeddings with LangChain:

  • Install langchain and the openai package, then set your OpenAI API key as OPENAI_API_KEY in your environment.
  • Use OpenAIEmbeddings() to create an embeddings object.
  • Use embed_query for single texts and embed_documents for lists.
  • Embeddings are numeric vectors used for similarity search, not readable text.
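A missing API key is easier to diagnose if you check for it before creating the embeddings object. The following is a minimal sketch: require_openai_key is a hypothetical helper, and the "sk-example" value is a placeholder rather than a real key. It accepts any mapping so it can be demonstrated with plain dicts, and it falls back to os.environ when none is given.

```python
import os


def require_openai_key(env=None) -> str:
    """Return the OPENAI_API_KEY value or raise a clear error (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating OpenAIEmbeddings()."
        )
    return key


# Demonstration with plain dicts so the snippet runs without a real key.
print(require_openai_key({"OPENAI_API_KEY": "sk-example"}))

try:
    require_openai_key({})
except RuntimeError as exc:
    print(f"Setup problem: {exc}")
```

In a real script you would call require_openai_key() with no arguments at startup, so a configuration problem surfaces immediately instead of as an authentication error on the first embedding call.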

Key Takeaways

  • Use OpenAIEmbeddings from langchain.embeddings to create embeddings easily.
  • Call embed_query for a single text and embed_documents for multiple texts.
  • Ensure your OpenAI API key is set in environment variables before running.
  • Pass only strings to embedding methods to avoid errors.
  • Embeddings are vectors for machine use, not human-readable text.