Recall & Review
beginner
What is the purpose of vector similarity metrics in machine learning?
Vector similarity metrics measure how alike two vectors are. They help compare data points, like checking if two images or texts are similar.
beginner
Explain Cosine Similarity in simple terms.
Cosine Similarity measures the angle between two vectors. If the angle is small, the vectors are similar. It ignores their length and focuses on direction.
beginner
What is Euclidean Distance and how does it relate to similarity?
Euclidean Distance is the straight-line distance between two points (vectors). A smaller distance means higher similarity; a larger distance means lower similarity.
intermediate
How does Jaccard Similarity work for comparing vectors?
Jaccard Similarity compares two sets by dividing the size of their overlap by the size of their union. For vectors, it measures how many features they share compared to total features.
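As a rough illustration of the overlap idea, the sketch below computes Jaccard similarity for two binary vectors with numpy. The helper name `jaccard` and the vectors `x` and `y` are made up for this example, and the value returned for two all-zero vectors is an assumed convention.

```python
import numpy as np

def jaccard(x, y):
    """Jaccard similarity for binary vectors: |intersection| / |union|."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    union = np.logical_or(x, y).sum()
    if union == 0:
        # Neither vector has any feature set; returning 1.0 is an assumed convention.
        return 1.0
    return np.logical_and(x, y).sum() / union

x = [1, 1, 0, 1, 0]
y = [1, 0, 0, 1, 1]
print(jaccard(x, y))  # 2 shared features out of 4 present in either vector -> 0.5
```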
intermediate
Why might you choose Cosine Similarity over Euclidean Distance?
Cosine Similarity focuses on direction, ignoring length, which is useful when magnitude varies but pattern matters. Euclidean Distance considers magnitude, which can be misleading if scale differs.
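A small sketch of how the two metrics can disagree when scale differs. The vectors `a`, `b`, and `c` and the helper `cosine_sim` are illustrative assumptions, not part of the original material.

```python
import numpy as np

def cosine_sim(u, v):
    # Dot product divided by the product of the vector lengths.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

a = np.array([1.0, 1.0])
b = np.array([10.0, 10.0])  # same direction as a, ten times longer
c = np.array([1.0, 0.0])    # close to a in space, but pointing elsewhere

print(round(cosine_sim(a, b), 4), round(np.linalg.norm(a - b), 4))  # 1.0 12.7279
print(round(cosine_sim(a, c), 4), round(np.linalg.norm(a - c), 4))  # 0.7071 1.0
```

Euclidean distance ranks `c` as the closer match to `a`, while cosine similarity ranks `b` as the closer match because it shares `a`'s direction.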
Which metric measures the angle between two vectors?
A. Cosine Similarity
B. Euclidean Distance
C. Jaccard Similarity
D. Manhattan Distance
Cosine Similarity calculates the cosine of the angle between two vectors to measure similarity.
If two vectors have a Euclidean distance of zero, what does that mean?
A. They are identical
B. They are completely different
C. They have no features
D. They have opposite directions
A Euclidean distance of zero means the vectors are exactly the same point in space.
Jaccard Similarity is best used for comparing:
A. Angles between vectors
B. Continuous numeric vectors
C. Distances in 3D space
D. Sets or binary vectors
Jaccard Similarity compares overlap between sets or binary vectors.
Which similarity metric ignores the length of vectors and focuses on direction?
A. Euclidean Distance
B. Hamming Distance
C. Cosine Similarity
D. Jaccard Similarity
Cosine Similarity measures the angle between vectors, ignoring their length.
A higher Euclidean distance between two vectors means:
A. They are more similar
B. They are less similar
C. They have the same direction
D. They have identical features
Greater Euclidean distance means vectors are farther apart and less similar.
Describe three common vector similarity metrics and when you might use each.
Think about what each metric focuses on: angle, distance, or overlap.
Explain why choosing the right similarity metric matters in machine learning tasks.
Consider how data features and scale affect similarity.
Practice
1. Which vector similarity metric measures the angle between two vectors to determine how similar they are?
easy
A. Manhattan distance
B. Euclidean distance
C. Cosine similarity
D. Jaccard similarity
Solution
Step 1: Understand cosine similarity
Cosine similarity measures the cosine of the angle between two vectors, showing how aligned they are.
Step 2: Compare with other metrics
Euclidean and Manhattan distances measure how far apart two vectors are, not the angle between them. Jaccard compares the overlap of sets or binary vectors.
Final Answer:
Cosine similarity -> Option C
Quick Check:
Angle-based similarity = Cosine similarity [OK]
Hint: Angle means cosine similarity, distance means Euclidean [OK]
Common Mistakes:
Confusing distance with angle measurement
Thinking Euclidean measures angle
Mixing set similarity with vector similarity
2. Which of the following is the correct Python expression to compute cosine similarity between two vectors a and b using numpy?
easy
A. np.linalg.norm(a - b)
B. np.dot(a, b) * (np.linalg.norm(a) + np.linalg.norm(b))
C. np.sum(np.abs(a - b))
D. np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
Solution
Step 1: Recall cosine similarity formula
Cosine similarity = dot product of vectors divided by product of their lengths (norms).
Step 2: Match formula to code
np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) matches this formula exactly. Option A, np.linalg.norm(a - b), is Euclidean distance; option C is Manhattan distance; option B multiplies by the sum of the norms and is not a valid formula.
Final Answer:
np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) -> Option D
Quick Check:
Dot product over norms = cosine similarity [OK]
Hint: Cosine = dot product divided by norms product [OK]
Common Mistakes:
Using subtraction instead of dot product
Multiplying by the norms instead of dividing by their product
Confusing Euclidean with cosine formula
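To see how the four expressions differ in practice, the sketch below evaluates each option on a pair of sample vectors. The values of `a` and `b` are chosen here only for illustration.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

option_a = np.linalg.norm(a - b)                                   # Euclidean distance
option_b = np.dot(a, b) * (np.linalg.norm(a) + np.linalg.norm(b))  # not a standard metric
option_c = np.sum(np.abs(a - b))                                   # Manhattan distance
option_d = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))  # cosine similarity

print(round(option_a, 4))  # 1.4142
print(round(option_b, 4))  # 240.0
print(round(option_c, 4))  # 2.0
print(round(option_d, 4))  # 0.96
```

Only option D stays within the range -1 to 1 that a cosine value must fall in.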
3. Given vectors a = np.array([1, 2, 3]) and b = np.array([4, 5, 6]), what is the output of np.linalg.norm(a - b)?
Hint: Euclidean norm = sqrt(sum of squared differences) [OK]
Common Mistakes:
Forgetting to square differences
Returning the sum of squared differences instead of its square root
Mixing up vector subtraction order
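The hint can be checked by working through the arithmetic for the vectors in the question. This is a minimal sketch; the intermediate names `diff` and `by_hand` are introduced here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

diff = a - b                           # [-3, -3, -3]
by_hand = np.sqrt(np.sum(diff ** 2))   # sqrt(9 + 9 + 9) = sqrt(27)

print(round(by_hand, 3))                            # 5.196
print(np.isclose(by_hand, np.linalg.norm(a - b)))   # True
```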
4. Identify the error in this Python code snippet for cosine similarity:
import numpy as np
def cosine_sim(a, b):
    return np.dot(a, b) / np.linalg.norm(a) + np.linalg.norm(b)
print(cosine_sim(np.array([1, 0]), np.array([0, 1])))
medium
A. The denominator should multiply norms, not add them
B. np.dot is used incorrectly; should be np.cross
C. Vectors must be normalized before dot product
D. Function is missing return statement
Solution
Step 1: Analyze denominator in formula
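Because division binds tighter than addition, the code computes np.dot(a, b) / np.linalg.norm(a) and then adds np.linalg.norm(b). The norms are added rather than multiplied, but cosine similarity divides by their product.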
Step 2: Understand correct formula
Cosine similarity = dot(a,b) / (norm(a) * norm(b)), so addition is wrong here.
Final Answer:
The denominator should multiply norms, not add them -> Option A
Quick Check:
Denominator = product of norms [OK]
Hint: Denominator in cosine similarity multiplies norms [OK]
Common Mistakes:
Adding norms instead of multiplying
Using cross product instead of dot product
Forgetting to return value
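One possible fix is to parenthesize the product of the norms. The sketch below keeps the original behavior under the assumed name `cosine_sim_buggy` so the two results can be compared on the perpendicular vectors from the question.

```python
import numpy as np

def cosine_sim_buggy(a, b):
    # Division binds tighter than addition: this is (dot / norm(a)) + norm(b).
    return np.dot(a, b) / np.linalg.norm(a) + np.linalg.norm(b)

def cosine_sim(a, b):
    # Parentheses make the product of the norms the whole denominator.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

x = np.array([1, 0])
y = np.array([0, 1])

print(cosine_sim_buggy(x, y))  # 1.0, wrong for perpendicular vectors
print(cosine_sim(x, y))        # 0.0
```

Neither version guards against zero-length vectors, which would cause a division by zero.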
5. You have two text documents represented as vectors: doc1 = [1, 0, 2, 1] and doc2 = [0, 1, 1, 1]. Which similarity metric is best to find how similar their topics are, and why?
hard
A. Cosine similarity, because it measures angle ignoring length differences
B. Euclidean distance, because it measures exact gap between vectors
C. Manhattan distance, because it sums absolute differences
D. Jaccard similarity, because it compares set overlap
Solution
Step 1: Understand vector meaning in text
Vectors represent word counts or weights; length can vary by document size.
Step 2: Choose metric ignoring length but capturing direction
Cosine similarity measures angle, so it focuses on topic similarity ignoring document length differences.
Final Answer:
Cosine similarity, because it measures angle ignoring length differences -> Option A
Quick Check:
Topic similarity = cosine similarity [OK]
Hint: For text vectors whose lengths vary with document size, angle-based similarity is usually a good fit [OK]
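For the two document vectors in the question, the cosine similarity can be computed directly. The sketch also doubles every count in doc1, as an assumed stand-in for a longer document on the same topic, to show that the score does not change.

```python
import numpy as np

doc1 = np.array([1, 0, 2, 1])
doc2 = np.array([0, 1, 1, 1])

cos = np.dot(doc1, doc2) / (np.linalg.norm(doc1) * np.linalg.norm(doc2))
print(round(cos, 4))  # 0.7071

# Doubling every count changes the vector's length but not its direction.
longer = 2 * doc1
cos_longer = np.dot(longer, doc2) / (np.linalg.norm(longer) * np.linalg.norm(doc2))
print(np.isclose(cos, cos_longer))  # True
```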