NLP · ML · ~20 mins

Word similarity and analogies in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️ Word Similarity and Analogy Master badge: answer all challenges correctly to earn it.
🧠 Conceptual · intermediate · 1:30 time limit
Understanding cosine similarity in word embeddings
Which statement best describes what cosine similarity measures between two word vectors?
A. The difference in magnitude between the two vectors, ignoring their direction
B. The Euclidean distance between the two vectors, showing how far apart they are in space
C. The sum of the element-wise multiplication of the two vectors without normalization
D. The angle between the two vectors, indicating how similar their directions are regardless of length
💡 Hint: Think about how cosine similarity focuses on direction rather than length.
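The hint can be checked directly: a minimal sketch (vectors chosen for illustration) showing that scaling a vector changes its length but not its cosine similarity to itself.

```python
import numpy as np

def cosine_similarity(a, b):
    # Dot product divided by the product of the norms: only the angle
    # between the vectors matters, not their magnitudes.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

v = np.array([0.8, 0.6])
w = 3 * v  # same direction, three times the magnitude
print(cosine_similarity(v, w))  # ≈ 1.0 despite the different lengths
```

Because the dot product is normalized by both norms, `v` and `3 * v` are maximally similar, while orthogonal vectors score 0.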
Predict Output · intermediate · 1:30 time limit
Output of word analogy vector operation
Given the following word vectors, what is the result of the operation vector('king') - vector('man') + vector('woman')? Vectors: king = [0.8, 0.6], man = [0.7, 0.5], woman = [0.6, 0.7]
import numpy as np

king = np.array([0.8, 0.6])
man = np.array([0.7, 0.5])
woman = np.array([0.6, 0.7])

result = king - man + woman
print(result)
A. [0.7 0.8]
B. [0.9 0.8]
C. [0.6 0.7]
D. [0.7 0.6]
💡 Hint: Perform the vector subtraction and addition step by step, element by element.
Model Choice · advanced · 2:00 time limit
Choosing the best model for capturing word analogies
Which word embedding model is best known for capturing linear relationships that allow solving analogies like 'king' - 'man' + 'woman' = 'queen'?
A. Bag-of-Words model
B. Word2Vec Skip-gram model
C. TF-IDF vectorizer
D. One-hot encoding
💡 Hint: Think about models that learn dense vector representations with semantic relationships.
Metrics · advanced · 1:30 time limit
Evaluating word similarity with Spearman correlation
You have a list of human-rated word similarity scores and cosine similarity scores from your model. Which metric best measures how well your model's similarity rankings match human judgments?
A. Spearman rank correlation coefficient
B. Mean squared error
C. Accuracy
D. Precision
💡 Hint: Consider metrics that compare rankings rather than exact values.
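To make the ranking idea concrete, here is a minimal sketch of Spearman rank correlation computed by hand (assuming no tied scores; the human ratings and cosine scores below are made-up illustrative values, not a real benchmark).

```python
import numpy as np

def spearman_rho(x, y):
    # Convert each list to ranks (1 = smallest value), then apply
    # Pearson correlation to the ranks. Assumes no ties for simplicity.
    rx = np.argsort(np.argsort(x)) + 1
    ry = np.argsort(np.argsort(y)) + 1
    return np.corrcoef(rx, ry)[0, 1]

# Made-up example: human ratings vs. model cosine similarities for 5 word pairs
human = [9.0, 7.5, 6.0, 3.0, 1.0]
model = [0.85, 0.80, 0.62, 0.40, 0.15]
print(spearman_rho(human, model))  # 1.0: the rankings agree exactly
```

The scales differ (0-10 ratings vs. cosine values in [-1, 1]), but the correlation is perfect because only the order of the scores matters.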
🔧 Debug · expert · 2:30 time limit
Debugging incorrect analogy results
You run this code to find the word closest to the vector result of 'king' - 'man' + 'woman', but it prints an unrelated word. What is the most likely cause?

import numpy as np

result = embeddings['king'] - embeddings['man'] + embeddings['woman']
closest_word = None
max_sim = -1
for word, vec in embeddings.items():
    sim = np.dot(result, vec) / (np.linalg.norm(result) * np.linalg.norm(vec))
    if sim > max_sim:
        max_sim = sim
        closest_word = word
print(closest_word)
A. The embeddings dictionary includes the words 'king', 'man', and 'woman', so no issue there
B. The embeddings vectors are not normalized before the operation, causing invalid results
C. The code does not exclude the input words from the search, so it may return 'king' or 'woman' instead of the correct analogy
D. The cosine similarity calculation is incorrect because it uses dot product without normalization
💡 Hint: Think about what happens if the search includes the original words used in the vector operation.
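A minimal sketch of the fix the hint points toward, using toy 2-D embeddings (the vectors, including 'queen', are illustrative values, not from a trained model): skip the analogy's input words during the nearest-neighbor search.

```python
import numpy as np

# Toy embeddings; values are made up for illustration.
embeddings = {
    'king':  np.array([0.8, 0.6]),
    'man':   np.array([0.7, 0.5]),
    'woman': np.array([0.6, 0.7]),
    'queen': np.array([0.7, 0.8]),
}

def closest_word(result, exclude):
    # Skip the analogy's input words; otherwise the search often returns
    # one of them (e.g. 'king'), since they lie close to the result vector.
    best_word, best_sim = None, -1.0
    for word, vec in embeddings.items():
        if word in exclude:
            continue
        sim = np.dot(result, vec) / (np.linalg.norm(result) * np.linalg.norm(vec))
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word

result = embeddings['king'] - embeddings['man'] + embeddings['woman']
print(closest_word(result, {'king', 'man', 'woman'}))  # prints 'queen'
```

With the `exclude` set removed, the loop may report 'king' or 'woman' as the nearest neighbor, which is exactly the failure described in the question.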