Model Pipeline - Word similarity and analogies
This pipeline learns a vector for each word from the contexts in which the word appears across many sentences. It then uses those vectors to measure how similar two words are, or to solve analogies like 'king is to queen as man is to ?'.
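Similarity between two word vectors is commonly measured with cosine similarity: the dot product of the vectors divided by the product of their norms. The following is a minimal sketch with NumPy; the function name `cosine_similarity` is an illustrative choice, and the two toy vectors are the `king` and `queen` values used in the exercises on this page, not output of a trained model.

```python
import numpy as np

def cosine_similarity(vec1, vec2):
    """Return the cosine of the angle between two 1-D vectors."""
    vec1 = np.asarray(vec1, dtype=float)
    vec2 = np.asarray(vec2, dtype=float)
    return float(np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2)))

# Toy vectors for illustration only.
king = [0.5, 0.8, 0.3]
queen = [0.45, 0.75, 0.35]

print(round(cosine_similarity(king, queen), 4))
```

A value close to 1 means the vectors point in nearly the same direction, and a value near 0 means they are close to orthogonal.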
[Plot: loss by epoch. x-axis: Epochs 1 to 5; y-axis ticks from 0.5 to 2.5.]
| Epoch | Loss ↓ | Accuracy ↑ | Observation |
|---|---|---|---|
| 1 | 2.3 | N/A | Initial training with high loss as model starts learning word contexts |
| 2 | 1.8 | N/A | Loss decreases as word vectors improve |
| 3 | 1.5 | N/A | Model captures better word relationships |
| 4 | 1.3 | N/A | Loss continues to decrease steadily |
| 5 | 1.2 | N/A | Training converges with stable loss |
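One way to read the table is to look at how much the loss falls between consecutive epochs. The sketch below uses only the loss values listed in the table; the variable names are illustrative.

```python
# Loss values copied from the table, epochs 1 to 5.
losses = [2.3, 1.8, 1.5, 1.3, 1.2]

# Drop in loss from each epoch to the next, rounded to hide float noise.
drops = [round(prev - curr, 1) for prev, curr in zip(losses, losses[1:])]

for epoch, drop in enumerate(drops, start=2):
    print(f"epoch {epoch - 1} -> {epoch}: loss fell by {drop}")
```

The drops are 0.5, 0.3, 0.2 and 0.1. Each is smaller than the one before, which is consistent with the observation that training is levelling off by epoch 5.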
[...] vec1 and vec2 in Python using NumPy?

king = [0.5, 0.8, 0.3]
queen = [0.45, 0.75, 0.35]
man = [0.6, 0.7, 0.2]
woman = [0.55, 0.65, 0.25]
[...] king - man + woman?

[...] king - man + woman but has a flaw:
import numpy as np
words = {
    'king': np.array([0.5, 0.8, 0.3]),
    'queen': np.array([0.45, 0.75, 0.35]),
    'man': np.array([0.6, 0.7, 0.2]),
    'woman': np.array([0.55, 0.65, 0.25]),
}
result = words['king'] - words['man'] + words['woman']
max_word = None
max_sim = -1
for word, vec in words.items():
    sim = np.dot(result, vec) / (np.linalg.norm(result) * np.linalg.norm(vec))
    if sim > max_sim:
        max_word = word
print(max_word)

Paris is to France as Tokyo is to ? Using pre-trained word vectors, which approach is best to find the answer?
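The debugging exercise and the Paris/Tokyo question both come down to one procedure: build a target vector by vector arithmetic, then pick the vocabulary word whose vector has the highest cosine similarity to it, skipping the words used in the query. The following is a minimal sketch, assuming the vocabulary is a plain dict of NumPy vectors. The helper name `solve_analogy` is hypothetical, and the toy vectors are the ones from the exercise, not pre-trained embeddings. Unlike the flawed snippet, it updates the running best similarity each time a better match is found.

```python
import numpy as np

def solve_analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' via b - a + c and cosine similarity."""
    target = vectors[b] - vectors[a] + vectors[c]
    best_word, best_sim = None, -1.0
    for word, vec in vectors.items():
        if word in (a, b, c):
            continue  # skip the query words themselves
        sim = np.dot(target, vec) / (np.linalg.norm(target) * np.linalg.norm(vec))
        if sim > best_sim:
            # Update both the word and the best similarity seen so far.
            best_word, best_sim = word, sim
    return best_word

# Toy vocabulary taken from the exercise.
words = {
    'king': np.array([0.5, 0.8, 0.3]),
    'queen': np.array([0.45, 0.75, 0.35]),
    'man': np.array([0.6, 0.7, 0.2]),
    'woman': np.array([0.55, 0.65, 0.25]),
}

# 'man is to king as woman is to ?' corresponds to king - man + woman.
answer = solve_analogy('man', 'king', 'woman', words)
print(answer)  # queen
```

With pre-trained vectors loaded into a dict of the same shape, the Paris/Tokyo question would be posed as `solve_analogy('Paris', 'France', 'Tokyo', vectors)`, assuming `vectors` contains entries for those words. Skipping the query words is a common design choice, because the nearest neighbour of the target vector is otherwise often one of the inputs.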