LangChain framework · ~30 mins

Why embeddings capture semantic meaning in LangChain
📖 Scenario: You want to understand how text embeddings can capture the meaning of words and sentences. Imagine you have a list of simple sentences, and you want to convert them into numbers that show how similar their meanings are.
🎯 Goal: Build a small Python program using LangChain to create embeddings for sentences and compare their similarity scores. This will help you see how embeddings capture semantic meaning.
📋 What You'll Learn
Create a list of sentences called sentences with exact values
Create a variable called embedding_model to hold the embedding model name
Use LangChain's OpenAIEmbeddings to generate embeddings for the sentences
Calculate cosine similarity between the first sentence embedding and the others
💡 Why This Matters
🌍 Real World
Embeddings help computers understand the meaning of text, which is useful in search engines, chatbots, and recommendation systems.
💼 Career
Knowing how to generate and compare embeddings is important for roles in AI, data science, and software development involving natural language processing.
1
Create the sentences list
Create a list called sentences with these exact strings: 'I love apples', 'Apples are my favorite fruit', 'I enjoy eating bananas', 'Bananas are yellow'.
Need a hint?

Use square brackets to create a list and include the exact sentences as strings.
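The first step is a plain Python list, a minimal sketch:

```python
# The four sentences the exercise uses, as exact string literals.
sentences = [
    'I love apples',
    'Apples are my favorite fruit',
    'I enjoy eating bananas',
    'Bananas are yellow',
]
print(len(sentences))  # → 4
```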

2
Set the embedding model
Create a variable called embedding_model and set it to the string 'text-embedding-ada-002', the name of an OpenAI embedding model.
Need a hint?

Assign the exact string to the variable embedding_model.
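This step is a single assignment:

```python
# The model identifier we will later pass to LangChain's OpenAIEmbeddings wrapper.
embedding_model = 'text-embedding-ada-002'
print(embedding_model)  # → text-embedding-ada-002
```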

3
Generate embeddings for sentences
Import OpenAIEmbeddings from langchain.embeddings. Create an instance called embedder using embedding_model. Then create a list called embeddings by applying embedder.embed_documents(sentences).
Need a hint?

Remember to import the class first, then create the embedder instance, and finally generate embeddings for the sentences list.
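One possible sketch of this step. Note that it needs the `openai` package installed and an `OPENAI_API_KEY` environment variable set, and that newer LangChain releases move the class from `langchain.embeddings` to the `langchain_openai` package:

```python
from langchain.embeddings import OpenAIEmbeddings

sentences = [
    'I love apples',
    'Apples are my favorite fruit',
    'I enjoy eating bananas',
    'Bananas are yellow',
]
embedding_model = 'text-embedding-ada-002'

# Wraps the OpenAI embedding endpoint; the API key is read
# from the OPENAI_API_KEY environment variable.
embedder = OpenAIEmbeddings(model=embedding_model)

# Returns one embedding vector (a list of floats) per sentence, in order.
embeddings = embedder.embed_documents(sentences)
print(len(embeddings))  # 4 vectors, one per sentence
```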

4
Calculate similarity with first sentence
Import cosine_similarity from sklearn.metrics.pairwise. Create a list called similarities by calculating the cosine similarity between the first embedding and each embedding in embeddings. Use a list comprehension with cosine_similarity([embeddings[0]], [e])[0][0] for each e in embeddings.
Need a hint?

Use cosine similarity to compare the first sentence embedding with each embedding in the list.
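The comparison step can be sketched as follows. To keep the example runnable without an API call, small hand-made vectors stand in for the real embeddings; with real data, `embeddings` would come from the previous step:

```python
from sklearn.metrics.pairwise import cosine_similarity

# Toy 3-dimensional vectors standing in for real sentence embeddings:
# the first two point in similar directions, the last two in another.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.0],
]

# Cosine similarity of the first vector against every vector in the list.
# cosine_similarity expects 2-D inputs, hence the extra brackets; [0][0]
# pulls the single score out of the resulting 1x1 matrix.
similarities = [cosine_similarity([embeddings[0]], [e])[0][0] for e in embeddings]
print(similarities)
```

The first score is always 1.0 (a vector is identical to itself), and vectors pointing in similar directions score higher than unrelated ones, which is exactly the pattern you should see with real embeddings of the apple and banana sentences.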