
Why embeddings capture semantic meaning in Prompt Engineering / GenAI - Why Metrics Matter

Metrics & Evaluation - Why embeddings capture semantic meaning
Which metric matters for this concept and WHY

When we talk about embeddings capturing semantic meaning, the key metric is cosine similarity. This metric measures how close two vectors are in direction, regardless of their length. Since embeddings are vectors representing words or sentences, cosine similarity tells us how similar their meanings are. A higher cosine similarity means the embeddings share more semantic meaning.
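As a concrete sketch, cosine similarity can be computed directly from two vectors. The toy vectors below are invented for illustration; real embeddings would come from a trained model:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of magnitudes:
    # the result depends only on direction, not on vector length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# A scaled copy points in the same direction, so similarity stays 1.0.
v = [0.2, 0.7, 0.1]
scaled = [3 * x for x in v]
print(round(cosine_similarity(v, scaled), 4))  # 1.0

# Orthogonal vectors share no direction, so similarity is 0.0.
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Because the norms cancel out any scaling, this is exactly the length-invariance the paragraph above describes.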

Confusion matrix or equivalent visualization (ASCII)
Example: Comparing embeddings of words "cat", "dog", and "car" using cosine similarity

          cat     dog     car
cat      1.00    0.85    0.10
dog      0.85    1.00    0.12
car      0.10    0.12    1.00

Here, "cat" and "dog" have high similarity (0.85), showing semantic closeness.
"cat" and "car" have low similarity (0.10), showing different meanings.

Precision vs Recall tradeoff with concrete examples

In semantic search or recommendation systems using embeddings, precision means how many of the retrieved items are truly relevant (semantically close). Recall means how many of all relevant items were found.

For example, if you search for "apple" meaning the fruit, high precision means most results are about fruit, not the company. High recall means you find most fruit-related items.

Sometimes increasing recall (finding more related items) lowers precision (some unrelated items appear). Balancing these depends on the application.
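A minimal sketch of precision and recall for one retrieval result; the item names and relevance labels below are made up to mirror the "apple" example:

```python
def precision_recall(retrieved, relevant):
    # Precision: fraction of retrieved items that are relevant.
    # Recall: fraction of all relevant items that were retrieved.
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved), hits / len(relevant)

# Query: "apple" (the fruit). One retrieved item is about the company.
relevant = {"green apple", "apple pie", "apple orchard", "fuji apple"}
retrieved = ["green apple", "apple pie", "Apple iPhone"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.67 recall=0.50
```

Retrieving more items (say, all four fruit results plus two company results) would raise recall to 1.00 while pushing precision down, which is the tradeoff described above.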

What "good" vs "bad" metric values look like for this use case

A good embedding model will have:

  • High cosine similarity (close to 1) for semantically similar words or sentences.
  • Low cosine similarity (close to 0 or negative) for unrelated meanings.

A bad model might show high similarity for unrelated words, confusing meanings, or low similarity for synonyms.

Metrics pitfalls (accuracy paradox, data leakage, overfitting indicators)

  • Ignoring vector length: Euclidean distance is sensitive to vector magnitude, so it can misrepresent semantic closeness where cosine similarity would not.
  • Overfitting embeddings: Embeddings trained on small datasets may memorize co-occurrences instead of generalizing meaning.
  • Data leakage: If test word pairs also appear in the training data, similarity scores may be artificially high.
  • Ignoring context: Static embeddings assign one vector per word, so they miss meanings that shift with sentence context.
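The first pitfall can be illustrated with two made-up vectors that point the same way but differ in magnitude: Euclidean distance reports them as far apart, while cosine similarity treats them as identical in direction.

```python
import math

def euclidean_distance(a, b):
    # Straight-line distance: grows with any difference in magnitude.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Direction-only comparison: magnitude cancels out.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

short = [1.0, 2.0]
long_ = [10.0, 20.0]  # same direction, 10x the magnitude

print(round(euclidean_distance(short, long_), 2))  # ~20.12: looks "far apart"
print(round(cosine_similarity(short, long_), 4))   # 1.0: same direction
```

If these were embeddings of two sentences with the same meaning but different norms, Euclidean distance would wrongly suggest they are unrelated.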

Self-check question

Your embedding model shows cosine similarity of 0.95 between "bank" (financial) and "river". Is this good? Why or why not?

Answer: No, this is not good. "Bank" in the financial sense and "river" are unrelated, so a similarity of 0.95 indicates the model conflates distinct meanings rather than capturing semantic differences.

Key Result
Cosine similarity is the key metric showing how well embeddings capture semantic meaning by measuring vector closeness.