Practice

1. Which vector similarity metric measures the angle between two vectors to determine how similar they are?

easy

A. Manhattan distance

B. Euclidean distance

C. Cosine similarity

D. Jaccard similarity

Solution

Step 1: Understand cosine similarity
Cosine similarity measures the cosine of the angle between two vectors, showing how aligned they are.
Step 2: Compare with other metrics
Euclidean and Manhattan distances measure how far apart two vectors are, not the angle between them. Jaccard similarity compares the overlap of sets rather than the direction of vectors.
Final Answer:
Cosine similarity -> Option C
Quick Check:
Angle-based similarity = Cosine similarity [OK]

Hint: Angle means cosine similarity; distance means Euclidean or Manhattan [OK]

Common Mistakes:

Confusing distance with angle measurement
Thinking Euclidean measures angle
Mixing set similarity with vector similarity
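
To make the angle-versus-distance contrast concrete, here is a minimal sketch. The vectors are made up for illustration: b points in exactly the same direction as a but is ten times longer, so an angle-based metric should treat them as identical while distance-based metrics should not.

```python
import numpy as np

# Illustrative vectors (assumed, not from the quiz): same direction, different length
a = np.array([1.0, 2.0, 3.0])
b = 10 * a

# Angle-based: depends only on direction
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Distance-based: grow as the vectors move apart in space
euclidean = np.linalg.norm(a - b)
manhattan = np.sum(np.abs(a - b))

print(round(cosine, 6))  # 1.0, the angle between a and b is zero
print(euclidean)         # clearly greater than zero
print(manhattan)         # 54.0
```

Cosine similarity reports a perfect match because the angle is zero, while both distances are large because the vectors differ in length.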

2. Which of the following is the correct Python expression to compute cosine similarity between two vectors a and b using numpy?

easy

A. np.linalg.norm(a - b)

B. np.dot(a, b) * (np.linalg.norm(a) + np.linalg.norm(b))

C. np.sum(np.abs(a - b))

D. np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

Solution

Step 1: Recall cosine similarity formula
Cosine similarity = dot product of vectors divided by product of their lengths (norms).
Step 2: Match formula to code
np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) matches this formula exactly. Option A, np.linalg.norm(a - b), is Euclidean distance; Option C is Manhattan distance; Option B multiplies the dot product by the sum of the norms, which is not a valid similarity formula.
Final Answer:
np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) -> Option D
Quick Check:
Dot product over norms = cosine similarity [OK]

Hint: Cosine = dot product divided by norms product [OK]

Common Mistakes:

Using subtraction instead of dot product
Multiplying the dot product by the norms instead of dividing by their product
Confusing Euclidean with cosine formula
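
As a sanity check on the formula, the sketch below wraps the Option D expression in a helper (the name cosine_similarity and the test vectors are illustrative assumptions) and recovers the angle from the result. The two vectors are chosen to sit 45 degrees apart.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product divided by the product of the Euclidean norms."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Illustrative vectors: b lies 45 degrees from a
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

sim = cosine_similarity(a, b)

# Clip guards against tiny floating-point overshoot outside [-1, 1]
angle_deg = np.degrees(np.arccos(np.clip(sim, -1.0, 1.0)))

print(round(sim, 4))        # 0.7071
print(round(angle_deg, 1))  # 45.0
```

Note that the helper does not handle zero vectors: a zero norm makes the denominator zero, so real code would need to decide how to treat that case.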

3. Given vectors a = np.array([1, 2, 3]) and b = np.array([4, 5, 6]), what is the output of np.linalg.norm(a - b)?

medium

A. 3.742

B. 5.196

C. 15.0

D. 32.0

Solution

Step 1: Calculate vector difference
a - b = [1-4, 2-5, 3-6] = [-3, -3, -3]
Step 2: Compute Euclidean norm
Norm = sqrt((-3)^2 + (-3)^2 + (-3)^2) = sqrt(9+9+9) = sqrt(27) ≈ 5.196
Final Answer:
5.196 -> Option B
Quick Check:
Euclidean distance = 5.196 [OK]

Hint: Euclidean norm = sqrt(sum of squared differences) [OK]

Common Mistakes:

Forgetting to square differences
Calculating sum instead of sqrt of sum
Mixing up vector subtraction order
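
The two solution steps can be reproduced directly in NumPy. This sketch computes the norm by hand and with np.linalg.norm. It also shows where the other answer options appear to come from, which is an inference from the numbers rather than something the quiz states.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

diff = a - b                     # [-3, -3, -3]
squared_sum = np.sum(diff ** 2)  # 9 + 9 + 9 = 27
manual = np.sqrt(squared_sum)    # sqrt(27)
builtin = np.linalg.norm(a - b)  # same value in one call

print(round(manual, 3), round(builtin, 3))  # 5.196 5.196

# Likely sources of the wrong options (an assumption about the quiz design)
print(round(np.linalg.norm(a), 3))  # 3.742, the norm of a alone
print(np.sum(b))                    # 15, the sum of b's elements
print(np.dot(a, b))                 # 32, the dot product
```

Because every difference is squared, np.linalg.norm(b - a) gives the same result as np.linalg.norm(a - b).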

4. Identify the error in this Python code snippet for cosine similarity:

import numpy as np

def cosine_sim(a, b):
    return np.dot(a, b) / np.linalg.norm(a) + np.linalg.norm(b)

print(cosine_sim(np.array([1, 0]), np.array([0, 1])))

medium

A. The denominator should multiply norms, not add them

B. np.dot is used incorrectly; should be np.cross

C. Vectors must be normalized before dot product

D. Function is missing return statement

Solution

Step 1: Analyze denominator in formula
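Because division binds tighter than addition, the code evaluates (np.dot(a, b) / np.linalg.norm(a)) + np.linalg.norm(b): it divides by one norm and then adds the other. Cosine similarity divides by the product of both norms, which requires parentheses around the denominator.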
Step 2: Understand correct formula
Cosine similarity = dot(a,b) / (norm(a) * norm(b)), so addition is wrong here.
Final Answer:
The denominator should multiply norms, not add them -> Option A
Quick Check:
Denominator = product of norms [OK]

Hint: Denominator in cosine similarity multiplies norms [OK]

Common Mistakes:

Adding norms instead of multiplying
Using cross product instead of dot product
Forgetting to return value
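
Running the snippet next to a corrected version makes the bug visible. The function names cosine_sim_buggy and cosine_sim_fixed are illustrative; the input vectors are the orthogonal pair from the question, whose true cosine similarity is 0.

```python
import numpy as np

def cosine_sim_buggy(a, b):
    # Parsed as (dot / norm(a)) + norm(b) because of operator precedence
    return np.dot(a, b) / np.linalg.norm(a) + np.linalg.norm(b)

def cosine_sim_fixed(a, b):
    # Parentheses make the product of the norms the denominator
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1, 0])
b = np.array([0, 1])

print(cosine_sim_buggy(a, b))  # 1.0, wrong: 0 / 1 + 1
print(cosine_sim_fixed(a, b))  # 0.0, correct for perpendicular vectors
```

The buggy version reports 1.0 for perpendicular vectors, which would wrongly suggest they point in the same direction.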

5. You have two text documents represented as vectors: doc1 = [1, 0, 2, 1] and doc2 = [0, 1, 1, 1]. Which similarity metric is best to find how similar their topics are, and why?

hard

A. Cosine similarity, because it measures angle ignoring length differences

B. Euclidean distance, because it measures exact gap between vectors

C. Manhattan distance, because it sums absolute differences

D. Jaccard similarity, because it compares set overlap

Solution

Step 1: Understand vector meaning in text
Vectors represent word counts or weights, so their magnitude grows with document size even when the topic stays the same.
Step 2: Choose metric ignoring length but capturing direction
Cosine similarity measures angle, so it focuses on topic similarity ignoring document length differences.
Final Answer:
Cosine similarity, because it measures angle ignoring length differences -> Option A
Quick Check:
Topic similarity = cosine similarity [OK]

Hint: For text vectors that differ in length, angle-based similarity is usually the better fit [OK]

Common Mistakes:

Using Euclidean which is sensitive to length
Confusing set similarity with vector similarity
Ignoring document length effect
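
A short sketch of why length matters, using the two document vectors from the question. The vector doc1_long is a hypothetical addition: it simulates the same document repeated twice, so every count doubles while the topic mix stays the same.

```python
import numpy as np

doc1 = np.array([1, 0, 2, 1])
doc2 = np.array([0, 1, 1, 1])
doc1_long = 2 * doc1  # hypothetical: doc1 repeated twice, same topic mix

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Cosine similarity is unchanged when one document gets longer
print(round(cosine_similarity(doc1, doc2), 4))       # 0.7071
print(round(cosine_similarity(doc1_long, doc2), 4))  # 0.7071

# Euclidean distance grows, although the topic has not changed
print(round(np.linalg.norm(doc1 - doc2), 3))       # 1.732
print(round(np.linalg.norm(doc1_long - doc2), 3))  # 3.873
```

The cosine score stays at about 0.7071 in both cases, while the Euclidean distance more than doubles, which is the length sensitivity the solution warns about.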
