Challenge - 5 Problems
Jaccard Similarity Master
❓ Predict Output
intermediate
What is the output of this Jaccard similarity calculation?
Given two sets
A = {1, 2, 3, 4} and B = {3, 4, 5, 6}, what is the Jaccard similarity computed by the code below?
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
intersection = A.intersection(B)
union = A.union(B)
jaccard_similarity = len(intersection) / len(union)
print(round(jaccard_similarity, 2))
💡 Hint
Recall that Jaccard similarity is the size of the intersection divided by the size of the union of two sets.
✅ Explanation
The intersection of A and B is {3, 4}, with size 2. The union is {1, 2, 3, 4, 5, 6}, with size 6. The similarity is 2/6 ≈ 0.3333, which the code rounds to two decimals, so the output is 0.33.
🧠 Conceptual
intermediate
Which statement best describes Jaccard similarity?
Choose the best description of what Jaccard similarity measures between two sets.
💡 Hint
Think about how much two sets overlap compared to their total combined size.
✅ Explanation
Jaccard similarity is defined as the size of the intersection divided by the size of the union of two sets, measuring their overlap.
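The definition above can be sketched as a small helper. This is an illustration, not part of the original challenge: the name `jaccard` is hypothetical, and returning 1.0 for two empty sets is an assumed convention (the ratio is otherwise undefined there).

```python
def jaccard(a, b):
    """Return |a ∩ b| / |a ∪ b| for two collections (hypothetical helper)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


print(jaccard({1, 2, 3}, {1, 2, 3}))  # identical sets -> 1.0
print(jaccard({1, 2}, {3, 4}))        # disjoint sets  -> 0.0
```

The value always lies between 0 (no shared elements) and 1 (identical sets).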
❓ Metrics
advanced
What is the Jaccard similarity between these two token sets?
Given two token sets from text documents:
doc1 = {'apple', 'banana', 'cherry'} and doc2 = {'banana', 'cherry', 'date', 'fig'}, what is their Jaccard similarity?
💡 Hint
Count the common tokens and total unique tokens.
✅ Explanation
The intersection is {'banana', 'cherry'}, with size 2. The union is {'apple', 'banana', 'cherry', 'date', 'fig'}, with size 5. Similarity = 2/5 = 0.4.
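The count can be checked with a few lines of Python. This is a minimal sketch that reuses the two token sets from the question; the variable names `common`, `combined`, and `similarity` are chosen here for illustration.

```python
doc1 = {'apple', 'banana', 'cherry'}
doc2 = {'banana', 'cherry', 'date', 'fig'}

common = doc1 & doc2      # tokens present in both documents
combined = doc1 | doc2    # all unique tokens across both documents
similarity = len(common) / len(combined)

print(sorted(common))     # ['banana', 'cherry']
print(len(combined))      # 5
print(similarity)         # 0.4
```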
🔧 Debug
advanced
Why does this Jaccard similarity code raise an error?
Consider this code snippet:
def jaccard_similarity(list1, list2):
    intersection = list1 & list2
    union = list1 | list2
    return len(intersection) / len(union)

print(jaccard_similarity(['a', 'b'], ['b', 'c']))
Why does it raise an error?
💡 Hint
Check the data types and operators used for intersection and union.
✅ Explanation
The '&' and '|' operators are defined for sets but not for lists. Applying them to two lists raises a TypeError (unsupported operand type(s) for &: 'list' and 'list').
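One possible fix, sketched here rather than taken from the original challenge, is to convert both lists to sets before applying the operators. The names `set1` and `set2` are introduced for illustration.

```python
def jaccard_similarity(list1, list2):
    # Convert the inputs to sets so that & and | are defined.
    set1, set2 = set(list1), set(list2)
    intersection = set1 & set2
    union = set1 | set2
    return len(intersection) / len(union)


print(round(jaccard_similarity(['a', 'b'], ['b', 'c']), 2))  # 0.33
```

This version still divides by zero when both inputs are empty, so a caller that can receive empty lists would need to handle that case separately.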
❓ Model Choice
expert
Which model output best matches Jaccard similarity for text similarity?
You have two text documents and want to measure their similarity using Jaccard similarity on token sets. Which model output below correctly computes this similarity?
💡 Hint
Consider how to tokenize text properly before computing sets.
✅ Explanation
The correct option splits the documents into words (tokens), converts them to sets, then computes the intersection and union. Options that treat each character as a token are incorrect for word-level similarity, and an option that applies '&' and '|' to lists is invalid because those operators are not defined for lists.
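The difference between word-level and character-level tokens can be seen in a short sketch. The function names `word_jaccard` and `char_jaccard`, the sample sentences, and the lowercase-plus-whitespace tokenizer are all assumptions made here for illustration; they are not the challenge's original options.

```python
def word_jaccard(text1, text2):
    # Assumption: lowercase and split on whitespace as a simple tokenizer.
    tokens1 = set(text1.lower().split())
    tokens2 = set(text2.lower().split())
    return len(tokens1 & tokens2) / len(tokens1 | tokens2)


def char_jaccard(text1, text2):
    # set() on a string yields its unique characters, spaces included.
    chars1, chars2 = set(text1), set(text2)
    return len(chars1 & chars2) / len(chars1 | chars2)


# Both helpers assume non-empty inputs.
doc_a = "the cat sat"
doc_b = "the dog sat"

print(word_jaccard(doc_a, doc_b))  # 0.5 (2 shared words out of 4)
print(char_jaccard(doc_a, doc_b))  # 0.6 (6 shared characters out of 10)
```

The character-level score is higher here only because the sentences share many letters, which says little about whether they share words.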