In similarity search, the goal is to find the items most like a query. The key metrics are Precision, Recall, and F1 score. Precision tells us what fraction of the retrieved items are truly similar. Recall tells us what fraction of the truly similar items we found. F1 is the harmonic mean of the two, so it is high only when both are high. We want high recall so we do not miss good matches, and high precision so we avoid wrong matches. When the order of results matters, Mean Average Precision (MAP) is used to measure ranking quality. These metrics tell us whether the search is accurate and useful.
Similarity search and retrieval in Prompt Engineering / GenAI - Model Metrics & Evaluation
                     | Retrieved | Not Retrieved
---------------------|-----------|--------------
Actually Similar     | TP        | FN
Actually Not Similar | FP        | TN

Where:
- TP (True Positive): Similar items that were correctly retrieved
- FP (False Positive): Retrieved items that are not similar
- FN (False Negative): Similar items missed by retrieval
- TN (True Negative): Non-similar items that were correctly not retrieved

Total items = TP + FP + FN + TN

Example: We have 100 items, and 30 are truly similar to the query. The model retrieves 40 items: 25 are truly similar (TP=25) and 15 are not (FP=15). It misses 5 similar items (FN=5), and the remaining 55 items are TN.

Precision = TP / (TP + FP) = 25 / (25 + 15) = 0.625
Recall = TP / (TP + FN) = 25 / (25 + 5) = 0.833
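The arithmetic in the example can be checked with a few lines of Python. This is a minimal sketch: the helper name `precision_recall_f1` is made up for illustration, and returning 0.0 when a denominator is zero is an assumption, not a standard every library follows.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts.

    Assumption: a metric with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the worked example: TP=25, FP=15, FN=5.
precision, recall, f1 = precision_recall_f1(tp=25, fp=15, fn=5)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

TN does not appear in any of the three formulas, which is why these metrics stay informative even when most items in the collection are not similar to the query.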
Imagine a photo app that finds similar pictures. If it shows many photos, it may find most similar ones (high recall) but also show wrong ones (low precision). If it shows fewer photos, it may be very sure about them (high precision) but miss some good matches (low recall).
In a music recommendation system, high recall means suggesting many songs you might like, but some may be off. High precision means only suggesting songs you really like, but fewer suggestions.
Choosing between precision and recall depends on what matters more: missing good matches (recall) or showing wrong matches (precision).
- Good: Precision and recall both above 0.8 means most retrieved items are correct and most similar items are found.
- Acceptable: Precision around 0.7 and recall around 0.7 means moderate quality, some errors and misses.
- Bad: Precision below 0.5 or recall below 0.5 means many wrong items retrieved or many similar items missed.
- Mean Average Precision (MAP) close to 1.0 is excellent. A random ranking scores roughly the fraction of relevant items in the collection, not 0.5, so judge MAP against that baseline.
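To make MAP concrete, here is a minimal sketch of average precision for one ranked result list, and its mean over several queries. The function names and the two relevance lists are hypothetical, and the sketch assumes every relevant item appears somewhere in the ranked list (otherwise the denominator should be the total number of relevant items for that query).

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of 0/1 flags, best-ranked result first.
    Assumption: all relevant items for the query appear in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two made-up queries: 1 marks a truly similar result, 0 a wrong one.
query_a = [1, 0, 1, 0]   # relevant items at ranks 1 and 3
query_b = [0, 1, 0, 0]   # single relevant item at rank 2
ap_a = average_precision(query_a)
ap_b = average_precision(query_b)
map_score = mean_average_precision([query_a, query_b])
print(f"AP(a)={ap_a:.3f} AP(b)={ap_b:.3f} MAP={map_score:.3f}")
```

Unlike precision and recall on an unordered set, this score drops when relevant items are pushed down the ranking, even if the same items are retrieved.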
- Accuracy paradox: If most items are not similar, accuracy can be high by always saying "not similar" but this is useless.
- Ignoring recall: High precision but low recall means many good matches are missed.
- Ignoring precision: High recall but low precision means many wrong matches confuse users.
- Data leakage: Using test items in training can inflate metrics falsely.
- Overfitting: Model performs well on known data but poorly on new queries.
Your similarity search model has 98% accuracy but only 12% recall on similar items. Is it good for production? Why or why not?
Answer: No, it is not good. The high accuracy is misleading because most items are not similar, so the model just says "not similar" often. The very low recall means it misses almost all truly similar items, which defeats the purpose of similarity search.
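One way to see how 98% accuracy and 12% recall can coexist is to plug in counts that produce exactly those figures. The counts below are invented for illustration (10,000 items, 200 of them truly similar); the scenario itself does not state them.

```python
# Hypothetical confusion-matrix counts chosen to reproduce the scenario.
tp, fn = 24, 176     # 200 truly similar items, only 24 retrieved
fp, tn = 24, 9776    # 9,800 non-similar items, 24 wrongly retrieved

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)
precision = tp / (tp + fp)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
```

Accuracy is dominated by the 9,776 true negatives, so it stays high even though 176 of the 200 similar items are never returned.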
Practice
What is the main goal of similarity search in machine learning?
Solution
Step 1: Understand the purpose of similarity search
Similarity search is used to find items that are similar or close to each other in a dataset.
Step 2: Compare options with the definition
Only the option "To find items that are close or alike in a collection" describes finding similar or close items, which matches the goal of similarity search.
Final Answer:
To find items that are close or alike in a collection -> Option C
Quick Check:
Similarity search = find similar items [OK]
- Confusing similarity search with sorting
- Thinking similarity search counts items
- Assuming it removes duplicates
Which of the following is the correct way to compute cosine similarity between two vectors A and B in Python using numpy?
import numpy as np
A = np.array([1, 2, 3])
B = np.array([4, 5, 6])
# What code computes cosine similarity?
Solution
Step 1: Recall cosine similarity formula
Cosine similarity = dot product of A and B divided by the product of their norms.
Step 2: Match formula to code options
np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B)) matches the formula exactly: the dot product is the numerator and the product of the two norms is the denominator.
Final Answer:
np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B)) -> Option D
Quick Check:
Cosine similarity = dot / (norm A * norm B) [OK]
- Adding norms instead of multiplying
- Subtracting norms in denominator
- Multiplying dot product by sum of norms
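The formula can be wrapped in a small reusable function. This is a sketch rather than a library API: the name `cosine_similarity` and the choice to raise an error for zero vectors are assumptions made here, since the quiz does not say how that case should be handled.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (||a|| * ||b||)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Assumption: treat zero vectors as an error instead of returning NaN.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


A = np.array([1, 2, 3])
B = np.array([4, 5, 6])
sim = cosine_similarity(A, B)
print(f"{sim:.4f}")
```

Here the dot product is 32 and the norms are sqrt(14) and sqrt(77), so the result is 32 / sqrt(1078), a value close to 1 because the two vectors point in nearly the same direction.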
Given the following vectors, what is the cosine similarity between vec1 and vec2?
import numpy as np
vec1 = np.array([1, 0, 0])
vec2 = np.array([0, 1, 0])
cos_sim = np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))
print("{:.2f}".format(cos_sim))
Solution
Step 1: Calculate dot product of vec1 and vec2
Dot product = 1*0 + 0*1 + 0*0 = 0.
Step 2: Calculate norms and cosine similarity
Norm of vec1 = 1, norm of vec2 = 1, so cosine similarity = 0 / (1*1) = 0.
Final Answer:
0.00 -> Option A
Quick Check:
Orthogonal vectors have cosine similarity 0 [OK]
- Confusing dot product with cosine similarity
- Forgetting to divide by norms
- Rounding errors causing wrong answer
Consider this code snippet for similarity search. What is the error?
import numpy as np
vectors = [np.array([1, 2]), np.array([3, 4])]
query = np.array([1, 0])
scores = []
for v in vectors:
    score = np.dot(query, v) / np.linalg.norm(query) * np.linalg.norm(v)
    scores.append(score)
print(scores)
Solution
Step 1: Analyze the cosine similarity formula in code
The formula should divide the dot product by the product of the norms: dot(query, v) / (norm(query) * norm(v)).
Step 2: Identify missing parentheses
The code does np.dot(query, v) / np.linalg.norm(query) * np.linalg.norm(v). Division and multiplication have equal precedence and are evaluated left to right, so the dot product is divided by norm(query) and the result is then multiplied by norm(v) instead of divided by it.
Final Answer:
Missing parentheses causing wrong order of operations -> Option A
Quick Check:
Use parentheses to group denominator multiplication [OK]
- Forgetting parentheses around denominator
- Using cross product instead of dot product
- Ignoring vector length mismatch
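Running the buggy expression next to the corrected one makes the effect of the missing parentheses visible. This sketch reuses the vectors from the question; the variable names `buggy` and `fixed` are introduced here for illustration.

```python
import numpy as np

vectors = [np.array([1, 2]), np.array([3, 4])]
query = np.array([1, 0])

# Buggy: evaluated left to right, so the result is multiplied by norm(v).
buggy = [
    float(np.dot(query, v) / np.linalg.norm(query) * np.linalg.norm(v))
    for v in vectors
]

# Fixed: parentheses group the two norms into a single denominator.
fixed = [
    float(np.dot(query, v) / (np.linalg.norm(query) * np.linalg.norm(v)))
    for v in vectors
]

print("buggy:", buggy)
print("fixed:", fixed)
```

A quick sanity check is that cosine similarity always lies between -1 and 1. The buggy scores exceed 1, which signals a formula error rather than a data problem.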
You have a collection of text documents converted into vectors. You want to find the top 2 most similar documents to a new query vector using cosine similarity. Which approach is best?
- Compute cosine similarity between query and each document vector.
- Sort documents by similarity score descending.
- Return top 2 documents.
Which code snippet correctly implements this?
import numpy as np
docs = [np.array([1, 0]), np.array([0, 1]), np.array([1, 1])]
query = np.array([1, 0])
# Choose the correct code:
Solution
Step 1: Compute cosine similarity correctly
The correct snippet is:
scores = [np.dot(query, d) / (np.linalg.norm(query) * np.linalg.norm(d)) for d in docs]
top2 = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2]
print(top2)
The first line computes cosine similarity as the dot product divided by the product of the norms, which is correct.
Step 2: Sort indices by similarity descending and select top 2
The second line sorts the document indices by score in descending order and keeps the first 2, matching the requirement.
Final Answer:
The snippet above -> Option B
Quick Check:
Cosine similarity + sort descending + top 2 [OK]
- Multiplying by the norms instead of dividing by their product
- Using cross product instead of dot product
- Sorting ascending instead of descending
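For larger collections, the same three steps (score, sort descending, take the top k) can be done without a Python loop by stacking the documents into a matrix. This is a sketch under stated assumptions: the helper name `top_k_cosine` is hypothetical, and it assumes no document or query vector is all zeros.

```python
import numpy as np


def top_k_cosine(query, docs, k=2):
    """Return (indices, scores) of the k documents most similar to query.

    Assumption: no vector has zero norm, so the division is always defined.
    """
    doc_matrix = np.vstack(docs).astype(float)      # shape (n_docs, dim)
    query = np.asarray(query, dtype=float)
    # One matrix-vector product gives every dot product at once.
    dots = doc_matrix @ query
    norms = np.linalg.norm(doc_matrix, axis=1) * np.linalg.norm(query)
    scores = dots / norms
    # Negate the scores so argsort yields descending order.
    order = np.argsort(-scores, kind="stable")[:k]
    return order.tolist(), scores[order].tolist()


docs = [np.array([1, 0]), np.array([0, 1]), np.array([1, 1])]
query = np.array([1, 0])
top_idx, top_scores = top_k_cosine(query, docs, k=2)
print(top_idx, top_scores)
```

With these vectors the first document is identical in direction to the query and the third is at 45 degrees, so indices 0 and 2 are returned, the same result as the list-comprehension version.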
