Medium · Debug · Q14 of 15
NLP - Word Embeddings
The following code tries to find the word most similar to king - man + woman but has a flaw:
import numpy as np
# Toy 3-dimensional embeddings
words = {
    'king': np.array([0.5, 0.8, 0.3]),
    'queen': np.array([0.45, 0.75, 0.35]),
    'man': np.array([0.6, 0.7, 0.2]),
    'woman': np.array([0.55, 0.65, 0.25]),
}
# Analogy vector: king - man + woman
result = words['king'] - words['man'] + words['woman']
max_word = None
max_sim = -1
for word, vec in words.items():
    # Cosine similarity between the analogy vector and this word's vector
    sim = np.dot(result, vec) / (np.linalg.norm(result) * np.linalg.norm(vec))
    if sim > max_sim:
        max_sim = sim
        max_word = word
print(max_word)

What is the main flaw?
A. The variable max_sim is initialized incorrectly
B. Division by zero occurs due to zero vector norm
C. The dot product is computed without normalizing vectors
D. The code does not exclude the original words from similarity search
Step-by-Step Solution
Solution:
  1. Step 1: Analyze the similarity search loop

    The loop compares the result vector with every word in the dictionary, including 'king', 'man', and 'woman', the very words used to build the result.
  2. Step 2: Understand why this is problematic

    The result vector usually stays close to the vectors it was built from, so one of the input words can come out as the top match instead of the intended answer. Analogy searches therefore skip the input words.
  3. Final Answer:

    The code does not exclude the original words from similarity search -> Option D
  4. Quick Check:

    Exclude input words to avoid bias [OK]
Quick Trick: Exclude input words from similarity search to avoid bias [OK]
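
The fix described above can be sketched as follows. This is a minimal illustration that reuses the toy vectors from the question; the helper names `cosine` and `most_similar` and the `exclude` parameter are assumptions made for this example and do not appear in the original code.

```python
import numpy as np

# Toy 3-dimensional embeddings copied from the question.
words = {
    'king': np.array([0.5, 0.8, 0.3]),
    'queen': np.array([0.45, 0.75, 0.35]),
    'man': np.array([0.6, 0.7, 0.2]),
    'woman': np.array([0.55, 0.65, 0.25]),
}


def cosine(a, b):
    """Cosine similarity of two non-zero vectors (hypothetical helper)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def most_similar(target, vocab, exclude=()):
    """Return (word, similarity) for the best match not listed in `exclude`."""
    best_word, best_sim = None, -np.inf
    for word, vec in vocab.items():
        if word in exclude:
            continue  # skip the words used to build the target vector
        sim = cosine(target, vec)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word, best_sim


inputs = ('king', 'man', 'woman')
result = words['king'] - words['man'] + words['woman']
best_word, best_sim = most_similar(result, words, exclude=inputs)
print(best_word)  # queen
```

With only four words in the vocabulary, 'queen' is the single candidate left after exclusion. In a realistic vocabulary the same `exclude` check is what keeps the input words from crowding out the intended answer.
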
Common Mistakes:
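  • Assuming a division-by-zero error without checking that any norm is actually zero
  • Thinking the max_sim initialization causes the error
  • Overlooking that the similarity formula already normalizes by dividing the dot product by both norms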
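
Each of these mistakes can be ruled out with a quick check. The sketch below assumes the same toy vectors as the question; the variable names (`norms`, `all_nonzero`, `same`) are illustrative only.

```python
import numpy as np

# Toy 3-dimensional embeddings copied from the question.
words = {
    'king': np.array([0.5, 0.8, 0.3]),
    'queen': np.array([0.45, 0.75, 0.35]),
    'man': np.array([0.6, 0.7, 0.2]),
    'woman': np.array([0.55, 0.65, 0.25]),
}
result = words['king'] - words['man'] + words['woman']

# Option B: no vector here has zero norm, so no division by zero occurs.
norms = {word: float(np.linalg.norm(vec)) for word, vec in words.items()}
norms['result'] = float(np.linalg.norm(result))
all_nonzero = all(n > 0 for n in norms.values())

# Option C: dividing the dot product by both norms gives the same value
# as normalizing each vector to unit length first.
vec = words['queen']
sim_formula = np.dot(result, vec) / (np.linalg.norm(result) * np.linalg.norm(vec))
unit_result = result / np.linalg.norm(result)
unit_vec = vec / np.linalg.norm(vec)
sim_unit = np.dot(unit_result, unit_vec)
same = bool(np.isclose(sim_formula, sim_unit))

# Option A: cosine similarity lies in [-1, 1], so starting max_sim at -1
# is a valid lower bound for the search.
print(all_nonzero, same)  # True True
```
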
