Recall & Review
beginner
What is cosine similarity?
Cosine similarity is a way to measure how similar two things are by looking at the angle between their vectors. It gives a value between -1 and 1, where 1 means exactly the same direction, 0 means no similarity, and -1 means opposite directions.
beginner
How is cosine similarity calculated between two vectors?
Cosine similarity is calculated by dividing the dot product of two vectors by the product of their lengths (magnitudes). Formula: cosine_similarity = (A · B) / (||A|| * ||B||).
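The formula above can be sketched in plain Python as follows. The function name `cosine_similarity` and the decision to raise an error for a zero vector are assumptions made for this illustration; the formula itself is undefined when either magnitude is 0.

```python
import math

def cosine_similarity(a, b):
    """Return (A · B) / (||A|| * ||B||) for two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))          # A · B
    norm_a = math.sqrt(sum(x * x for x in a))       # ||A||
    norm_b = math.sqrt(sum(y * y for y in b))       # ||B||
    if norm_a == 0 or norm_b == 0:
        # Assumption for this sketch: treat a zero vector as an error,
        # because the formula would divide by zero.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# [6, 8] is [3, 4] scaled by 2, so both point in the same direction.
print(cosine_similarity([3, 4], [6, 8]))  # prints 1.0
```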
beginner
Why is cosine similarity useful in text analysis?
Cosine similarity helps compare texts by turning each text into a vector (for example, of word counts or TF-IDF weights) and measuring how close the vectors' directions are. Because the score does not depend on vector length, a short text and a long text about the same topic can still score as similar.
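A minimal sketch of this length independence, assuming a simple word-count representation. The helper names `count_vector` and `cosine_similarity` and the sample sentences are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def count_vector(text, vocabulary):
    # Count how often each vocabulary word occurs in the text.
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

short_doc = "cats chase mice"
long_doc = "cats chase mice " * 3   # same words, three times as long

vocabulary = sorted(set(short_doc.lower().split()))
v_short = count_vector(short_doc, vocabulary)   # [1, 1, 1]
v_long = count_vector(long_doc, vocabulary)     # [3, 3, 3]

similarity = cosine_similarity(v_short, v_long)  # approximately 1.0
distance = math.dist(v_short, v_long)            # Euclidean distance, clearly above 0

print(similarity, distance)
```

The two documents use the same words in the same proportions, so the cosine similarity is (up to floating-point rounding) 1, while the Euclidean distance grows with the difference in length.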
beginner
What does a cosine similarity score of 0 mean?
A cosine similarity score of 0 means the two vectors are at a 90-degree angle, so they have no similarity or relation in direction.
intermediate
Can cosine similarity be negative? What does that mean?
Yes, cosine similarity can be negative, down to -1. A negative score means the vectors point in opposite directions, showing opposite or very different meanings.
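The three landmark scores (1, 0, and -1) can be reproduced with a small sketch. The vectors below are hypothetical and were picked so the arithmetic is exact.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

base = [1, 0]
same_direction = [2, 0]    # base scaled by 2
orthogonal = [0, 3]        # at a 90-degree angle to base
opposite = [-4, 0]         # base scaled by -4

print(cosine_similarity(base, same_direction))  # prints 1.0
print(cosine_similarity(base, orthogonal))      # prints 0.0
print(cosine_similarity(base, opposite))        # prints -1.0
```

Vectors built from word counts or TF-IDF weights have no negative entries, so their scores stay between 0 and 1; negative scores appear only when vectors can contain negative components.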
What is the range of cosine similarity values?
A. -0.5 to 0.5
B. 0 to 1
C. -1 to 1
D. 0 to 100
Cosine similarity ranges from -1 (opposite) to 1 (same direction).
Which part of the cosine similarity formula measures the length of a vector?
A. Magnitude (length)
B. Difference of vectors
C. Sum of elements
D. Dot product
Magnitude measures the length of a vector and is used to normalize the dot product.
Why is cosine similarity preferred over Euclidean distance for text similarity?
A. It ignores vector length, focusing on direction
B. It is faster to compute
C. It works only with numbers
D. It measures exact word matches
Cosine similarity focuses on the angle (direction) between vectors, ignoring length differences.
If two text vectors have a cosine similarity of 1, what does it mean?
A. They are completely different
B. They have no words in common
C. They are orthogonal
D. They are identical in direction
A cosine similarity of 1 means the vectors point in the exact same direction.
What does a cosine similarity of 0 indicate about two vectors?
A. Vectors are identical
B. Vectors are orthogonal (no similarity)
C. Vectors are opposite
D. Vectors have negative correlation
A score of 0 means the vectors are at right angles, showing no similarity.
Explain in your own words what cosine similarity measures and why it is useful in comparing text data.
Think about how you compare two sentences by their meaning, not length.
Describe how you would calculate cosine similarity between two word vectors step-by-step.
Remember the formula: (A · B) / (||A|| * ||B||).
Practice
1. What does cosine similarity measure between two vectors?
easy
A. The difference in vector lengths
B. How close the vectors point in the same direction
C. The sum of vector elements
D. The distance between vector origins
Solution
Step 1: Understand vector comparison
Cosine similarity compares the angle between two vectors, not their length or sum.
Step 2: Interpret cosine similarity meaning
A value close to 1 means vectors point in the same direction, showing similarity.
Final Answer:
How close the vectors point in the same direction -> Option B
Quick Check:
Cosine similarity = direction closeness [OK]
Hint: Cosine similarity checks angle, not length or sum [OK]
Common Mistakes:
Confusing cosine similarity with Euclidean distance
Thinking it measures vector length difference
Assuming it sums vector values
2. Which of the following is the correct formula for cosine similarity between vectors A and B?
easy
A. \( \frac{\|A\|}{\|B\|} \)
B. \( \|A - B\| \)
C. \( \frac{A \cdot B}{\|A\| \times \|B\|} \)
D. \( A + B \)
Solution
Step 1: Recall cosine similarity formula
Cosine similarity is the dot product of vectors divided by the product of their lengths.
Step 2: Match formula to options
Only \( \frac{A \cdot B}{\|A\| \times \|B\|} \) puts the dot product over the product of the two lengths; the other options do not.
Final Answer:
\( \frac{A \cdot B}{\|A\| \times \|B\|} \) -> Option C
Quick Check:
Cosine similarity = dot product / product of norms [OK]
Hint: Look for dot product over product of lengths [OK]
Common Mistakes:
Choosing Euclidean distance formula
Adding vectors instead of dot product
Dividing norms instead of multiplying
3. Given vectors A = [1, 2, 3] and B = [4, 5, 6], what is the cosine similarity (rounded to 2 decimals)?
Hint: Calculate dot product and divide by product of lengths [OK]
Common Mistakes:
Forgetting to take vector norms
Mixing up dot product with element-wise multiplication
Rounding too early causing wrong answer
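The arithmetic for A = [1, 2, 3] and B = [4, 5, 6] can be checked with a short sketch that keeps full precision until the final step. The variable names are illustrative only.

```python
import math

A = [1, 2, 3]
B = [4, 5, 6]

dot = sum(x * y for x, y in zip(A, B))       # 1*4 + 2*5 + 3*6 = 32
norm_a = math.sqrt(sum(x * x for x in A))    # sqrt(1 + 4 + 9)    = sqrt(14)
norm_b = math.sqrt(sum(y * y for y in B))    # sqrt(16 + 25 + 36) = sqrt(77)

similarity = dot / (norm_a * norm_b)         # 32 / sqrt(14 * 77)

# Round only at the end to avoid the "rounding too early" mistake.
print(round(similarity, 2))  # prints 0.97
```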
4. What is wrong with this Python code to compute cosine similarity?
import numpy as np

def cosine_sim(a, b):
    return np.dot(a, b) / np.linalg.norm(a + b)

A = np.array([1, 0])
B = np.array([0, 1])
print(cosine_sim(A, B))
medium
A. It should add vectors before dot product
B. It uses np.dot instead of np.cross
C. It misses normalizing vectors before dot product
D. It divides by norm of sum instead of product of norms
Solution
Step 1: Analyze denominator in code
The code divides by norm of (a + b), but cosine similarity requires product of norms of a and b.
Step 2: Understand correct formula
Correct denominator is np.linalg.norm(a) * np.linalg.norm(b), not norm of sum.
Final Answer:
It divides by norm of sum instead of product of norms -> Option D
Quick Check:
Denominator must be product of norms [OK]
Hint: Denominator is product of norms, not norm of sum [OK]
Common Mistakes:
Using norm of sum instead of product
Confusing dot product with cross product
Normalizing vectors before dot product unnecessarily
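A sketch of the corrected function next to the version from the question, so the two denominators can be compared. The name `cosine_sim_buggy` and the extra vector `C` are hypothetical additions for this illustration.

```python
import numpy as np

def cosine_sim_buggy(a, b):
    # Denominator from the question: norm of the sum (incorrect).
    return np.dot(a, b) / np.linalg.norm(a + b)

def cosine_sim(a, b):
    # Corrected denominator: product of the two norms.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

A = np.array([1, 0])
B = np.array([0, 1])
C = np.array([1, 0])  # hypothetical extra vector, same direction as A

print(cosine_sim_buggy(A, B), cosine_sim(A, B))
print(cosine_sim_buggy(A, C), cosine_sim(A, C))
```

For the orthogonal pair A and B, both versions return 0 because the dot product is 0, so the original test input hides the bug. For A and C, which point in the same direction, the buggy version returns 0.5 while the corrected version returns 1.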
5. You have two text documents represented as TF-IDF vectors: doc1 = [0, 1, 2, 0] and doc2 = [1, 0, 1, 1]. Which step is best to improve cosine similarity comparison for very sparse vectors?
hard
A. Normalize vectors to unit length before computing cosine similarity
B. Add the vectors element-wise before similarity
C. Use Euclidean distance instead of cosine similarity
D. Ignore zero elements in vectors
Solution
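Step 1: Understand what normalization does
Cosine similarity already divides by both magnitudes, so scaling the vectors to unit length does not change the score. It does let the score be computed as a plain dot product, which is cheap for sparse vectors because only the non-zero entries contribute.
Step 2: Rule out the other options
Adding the vectors does not compare them, Euclidean distance is sensitive to vector length, and ignoring zero elements changes nothing because they already contribute 0 to the dot product and the norms.
Final Answer:
Normalize vectors to unit length before computing cosine similarity -> Option A
Quick Check:
On unit-length vectors, cosine similarity = dot product [OK]
Hint: Normalize once, then compare with dot products [OK]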
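A quick check on the two documents from the question: dividing each vector by its length and then taking a plain dot product gives the same score as the full formula. The helper names `dot`, `norm`, and `normalize` are assumptions made for this sketch.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def normalize(v):
    # Scale v to unit length (assumes v is not the zero vector).
    length = norm(v)
    return [x / length for x in v]

doc1 = [0, 1, 2, 0]
doc2 = [1, 0, 1, 1]

cosine = dot(doc1, doc2) / (norm(doc1) * norm(doc2))   # 2 / sqrt(15)
unit_dot = dot(normalize(doc1), normalize(doc2))       # dot product of unit vectors

print(round(cosine, 4), round(unit_dot, 4))  # prints 0.5164 0.5164
```

When many documents are compared against each other, normalizing each vector once up front means every later comparison needs only a dot product.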