Embedding models for semantic search in Agentic AI - Model Metrics & Evaluation

Metrics & Evaluation - Embedding models for semantic search
Which metric matters for embedding models in semantic search and WHY

For embedding models used in semantic search, the key metric is Recall@K: the fraction of a query's relevant items that appear in the top K search results, usually averaged over all test queries. It matters because users want the right answers to show up quickly, not buried deep in the list.

Another important metric is Mean Reciprocal Rank (MRR): for each query, take 1 divided by the rank of the first relevant result, then average over all queries. A higher MRR means users find what they want faster.
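
To make the MRR definition concrete, here is a minimal Python sketch. The function names and the three sample queries are hypothetical and chosen only for illustration; they do not come from any particular library.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


# Hypothetical results for three queries.
runs = [
    (["d1", "d2", "d3"], {"d1"}),  # first relevant hit at rank 1 -> 1.0
    (["d4", "d5", "d6"], {"d5"}),  # first relevant hit at rank 2 -> 0.5
    (["d7", "d8", "d9"], {"d0"}),  # no relevant hit -> 0.0
]
mrr = mean_reciprocal_rank(runs)  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

Only the first relevant result per query counts toward MRR, so it says nothing about how many other relevant items were retrieved.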

Precision is usually secondary here because semantic search often prioritizes surfacing as many relevant items as possible over filtering out every irrelevant one.

Confusion matrix or equivalent visualization

Semantic search evaluation often uses a ranking table instead of a confusion matrix. Here is a simple example for one query:

Query: "apple fruit benefits"

Rank | Document ID | Relevant?
-----|-------------|----------
1    | Doc_5       | Yes (TP)
2    | Doc_12      | No (FP)
3    | Doc_3       | Yes (TP)
4    | Doc_7       | No (FP)
5    | Doc_9       | No (FP)

Total relevant docs in dataset: 3
Relevant docs retrieved in top 5: 2
Recall@5 = 2/3 ≈ 0.67

This shows what fraction of the relevant documents appear in the top results, which is what Recall@K measures.
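
The same calculation can be sketched in Python. The document IDs in the ranking come from the table above; `Doc_21` is a made-up ID standing in for the third relevant document that was not retrieved, and `recall_at_k` is a hypothetical helper name.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Ranking from the example table for the query "apple fruit benefits".
ranked = ["Doc_5", "Doc_12", "Doc_3", "Doc_7", "Doc_9"]

# Three relevant documents exist; Doc_21 is a hypothetical ID for the one
# that never made it into the top 5.
relevant = {"Doc_5", "Doc_3", "Doc_21"}

recall_5 = recall_at_k(ranked, relevant, k=5)  # 2 of 3 relevant docs found
```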

Precision vs Recall tradeoff with concrete examples

In semantic search, recall is usually more important than precision. For example:

  • High recall, lower precision: The search returns many results including most relevant ones, but also some irrelevant. This is good if users want to see all possible answers.
  • High precision, lower recall: The search returns only very confident results but misses some relevant ones. This might frustrate users who want a complete answer.

For example, a medical literature search should have high recall to avoid missing important studies, even if some irrelevant papers appear.
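
One way to see the tradeoff is to compute both metrics at several cutoffs for one ranking. The sketch below uses a hypothetical ranking of ten documents with three relevant ones; the helper names are illustrative, not from a library.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k results that are relevant (assumes len(ranked_ids) >= k)."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Hypothetical ranking: "a", "b", and "c" are the relevant documents.
ranked = ["a", "x", "b", "y", "z", "c", "w", "v", "u", "t"]
relevant = {"a", "b", "c"}

# Map each cutoff K to a (precision, recall) pair.
tradeoff = {
    k: (precision_at_k(ranked, relevant, k), recall_at_k(ranked, relevant, k))
    for k in (1, 3, 5, 10)
}
```

In this example, returning more results raises recall from 1/3 at K=1 to 1.0 at K=10, while precision falls from 1.0 to 0.3.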

What "good" vs "bad" metric values look like for semantic search

Good values (rough rules of thumb; realistic targets depend on the dataset and task):

  • Recall@10 above 0.8 means most relevant items appear in the top 10 results.
  • MRR above 0.7 means relevant results appear near the top.

Bad values:

  • Recall@10 below 0.4 means many relevant items are missed in top results.
  • MRR below 0.3 means relevant results appear too far down the list.

Low recall frustrates users because they miss important information. Low MRR means users spend more time scrolling.

Common pitfalls in metrics for embedding semantic search
  • Ignoring recall: Focusing only on precision can hide that many relevant results are missed.
  • Data leakage: If test queries or documents appear in training, metrics look artificially high.
  • Overfitting: The model performs well on training or benchmark data but poorly on new queries, showing unstable recall or MRR.
  • Using accuracy: Accuracy is not meaningful for ranking tasks like semantic search.
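
As a rough sketch of a data leakage check, the snippet below flags test queries that also appear in the training set after simple normalization. The function names and sample queries are hypothetical, and this only catches exact matches after lowercasing and whitespace cleanup; paraphrases and near-duplicates would need a fuzzier comparison.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def find_leaked_queries(train_queries, test_queries):
    """Return the test queries whose normalized form also occurs in training."""
    train_set = {normalize(q) for q in train_queries}
    return [q for q in test_queries if normalize(q) in train_set]


# Hypothetical query sets.
train_queries = ["Apple fruit benefits", "how to reset password"]
test_queries = ["apple  fruit benefits", "best hiking boots"]

leaked = find_leaked_queries(train_queries, test_queries)
```

Any query returned here would be a candidate for removal from the test set before computing Recall@K or MRR.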
Self-check question

Your embedding model for semantic search has 98% accuracy but only 12% recall@10 on relevant documents. Is it good for production? Why or why not?

Answer: No, it is not good. High accuracy is misleading here because most documents are irrelevant to any given query, so a model can score well simply by treating almost everything as irrelevant. A Recall@10 of 12% means very few relevant results reach the top 10, so users will miss important information.
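
The effect can be illustrated with a small calculation. The numbers below are hypothetical (they are not the ones in the question) and treat "appears in the top 10" as a prediction of "relevant" for a single query.

```python
# Hypothetical setup for one query.
corpus_size = 1000       # documents in the index
num_relevant = 25        # documents that are actually relevant
k = 10                   # results shown to the user
relevant_in_top_k = 3    # relevant documents among those k results

tp = relevant_in_top_k               # relevant and retrieved
fp = k - tp                          # retrieved but irrelevant
fn = num_relevant - tp               # relevant but not retrieved
tn = corpus_size - tp - fp - fn      # irrelevant and not retrieved

accuracy = (tp + tn) / corpus_size   # dominated by the many true negatives
recall_at_10 = tp / num_relevant     # what the user actually experiences
```

Here accuracy comes out at 0.971 even though Recall@10 is only 0.12, because the 968 true negatives swamp everything else.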

Key Result
Recall@K and Mean Reciprocal Rank (MRR) are the key metrics for checking that relevant results appear near the top in semantic search.

Practice

1. What is the main purpose of embedding models in semantic search?
easy
A. To convert text into numbers that capture meaning
B. To count the number of words in a text
C. To translate text into another language
D. To remove stop words from text

Solution

  1. Step 1: Understand embedding models

    Embedding models transform text into numerical vectors that represent the meaning of the text.
  2. Step 2: Identify the purpose in semantic search

    These vectors help find texts with similar meanings, even if the exact words differ.
  3. Final Answer:

    To convert text into numbers that capture meaning -> Option A
  4. Quick Check:

    Embedding models = convert text to meaningful numbers [OK]
Hint: Embedding models turn words into meaningful numbers [OK]
Common Mistakes:
  • Thinking embeddings count words
  • Confusing embeddings with translation
  • Believing embeddings remove words
2. Which of the following is the correct way to get an embedding vector for a text using a model called embed_model in Python?
easy
A. embedding = embed_model.get_embedding('sample text')
B. embedding = embed_model.text_to_vector('sample text')
C. embedding = embed_model.encode('sample text')
D. embedding = embed_model.vectorize('sample text')

Solution

  1. Step 1: Recall common embedding method names

    Many embedding libraries, such as sentence-transformers, use an encode method to convert text to vectors.
  2. Step 2: Check method correctness

    Of the four options, only embed_model.encode('sample text') follows that convention; the other names are not typical, although the exact method always depends on the library in use.
  3. Final Answer:

    embedding = embed_model.encode('sample text') -> Option C
  4. Quick Check:

    Use encode() to get embeddings [OK]
Hint: Use encode() method to get embeddings [OK]
Common Mistakes:
  • Using non-existent methods like text_to_vector
  • Confusing method names
  • Forgetting to call the method with parentheses
3. Given the following Python code using an embedding model, what will be the output type of embedding?
embedding = embed_model.encode('Find similar texts')
medium
A. A list of words
B. A numeric vector (list or array) representing the text
C. A string representing the text
D. A dictionary with word counts

Solution

  1. Step 1: Understand what encode() returns

    The encode() method returns a numeric vector that captures the meaning of the input text.
  2. Step 2: Identify the output type

    This vector is usually a list or array of numbers, not words, strings, or dictionaries.
  3. Final Answer:

    A numeric vector (list or array) representing the text -> Option B
  4. Quick Check:

    encode() output = numeric vector [OK]
Hint: Embedding output is always a numeric vector [OK]
Common Mistakes:
  • Expecting a list of words
  • Thinking output is a string
  • Confusing embeddings with word counts
4. You wrote this code to get embeddings but get an error:
embedding = embed_model.encode['text to search']
What is the error, and how do you fix it?
medium
A. Add a return statement before encode
B. Change 'text to search' to a list of words
C. Remove the encode method and use embed_model directly
D. Use parentheses () instead of brackets [] to call encode method

Solution

  1. Step 1: Identify the error

    Methods in Python are called with parentheses (), not brackets []. Brackets try to index the method object itself, which raises a TypeError at runtime.
  2. Step 2: Correct the method call

    Replace encode['text to search'] with encode('text to search') to fix the error.
  3. Final Answer:

    Use parentheses () instead of brackets [] to call encode method -> Option D
  4. Quick Check:

    Method calls need () not [] [OK]
Hint: Call methods with () not [] [OK]
Common Mistakes:
  • Using brackets [] instead of parentheses ()
  • Passing wrong argument types
  • Trying to call method without parentheses
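
The error from question 4 can be reproduced with a small stand-in. `ToyEmbedder` is a hypothetical class written only for this sketch, not a real library model, and its "embedding" is a placeholder with no semantic meaning.

```python
class ToyEmbedder:
    """Hypothetical stand-in for an embedding model; not a real library class."""

    def encode(self, text):
        # Placeholder "embedding": one number per character, for illustration only.
        return [float(ord(ch)) for ch in text]


embed_model = ToyEmbedder()

try:
    # Brackets subscript the method object instead of calling it.
    embedding = embed_model.encode["text to search"]
except TypeError as exc:
    error_message = str(exc)

# Parentheses call the method and return the vector.
embedding = embed_model.encode("text to search")
```

Because this is a runtime TypeError rather than a syntax error, the code parses fine and only fails when that line executes.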
5. You want to build a semantic search system that finds documents similar in meaning to a query. Which approach best uses embedding models for this task?
hard
A. Convert all documents and the query to embeddings, then find documents with closest vectors
B. Count keyword frequency in documents and query, then match counts
C. Translate documents to another language before searching
D. Sort documents alphabetically and pick the first matches

Solution

  1. Step 1: Understand semantic search with embeddings

    Semantic search uses embeddings to represent meaning, so comparing vectors finds similar meaning.
  2. Step 2: Identify the correct approach

    Converting documents and query to embeddings and finding closest vectors is the correct method for semantic search.
  3. Final Answer:

    Convert all documents and the query to embeddings, then find documents with closest vectors -> Option A
  4. Quick Check:

    Semantic search = compare embedding vectors [OK]
Hint: Compare embeddings of query and documents for semantic search [OK]
Common Mistakes:
  • Using keyword counts instead of embeddings
  • Translating text unnecessarily
  • Sorting alphabetically instead of by meaning
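
The approach in option A can be sketched as follows. The three-dimensional vectors are hand-written, hypothetical stand-ins for real model output (which typically has hundreds of dimensions), and cosine similarity is one common choice of closeness measure, assumed here for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical precomputed document embeddings.
doc_vecs = {
    "fruit_nutrition": [0.9, 0.1, 0.0],
    "phone_review": [0.1, 0.9, 0.1],
    "orchard_guide": [0.7, 0.0, 0.3],
}
query_vec = [0.8, 0.1, 0.1]  # hypothetical embedding of the query

results = search(query_vec, doc_vecs, top_k=2)
top_ids = [doc_id for doc_id, _ in results]
```

With these toy vectors, the two fruit-related documents rank above the phone review. A production system would usually replace the linear scan with a vector index.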