Word embeddings map words to vectors in a continuous space. Why do similar words end up close to each other in this space?
Think about how words that appear in similar sentences might share meaning.
Embeddings are typically learned by predicting a word from its surrounding context, or the context from the word. Words that appear in similar contexts receive similar training signals, so they end up with similar vectors, capturing semantic similarity.
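As a rough illustration (not how prediction-based embeddings are actually trained), raw co-occurrence counts already show the effect: words that share contexts get similar count vectors. The toy corpus below is invented for this example.

```python
import numpy as np

# Toy corpus: "cat" and "dog" appear in the same contexts ("sat", "ran"),
# while "stock" appears in a different context.
corpus = [
    "the cat sat here", "the dog sat here",
    "the cat ran fast", "the dog ran fast",
    "the stock rose today",
]

# Build a vocabulary and a word-by-word co-occurrence matrix,
# counting words that appear in the same sentence.
vocab = sorted({w for s in corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))
for s in corpus:
    words = s.split()
    for w in words:
        for c in words:
            if w != c:
                counts[idx[w], idx[c]] += 1

def cos(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# "cat" and "dog" share contexts, so their count vectors are more
# similar than those of "cat" and "stock".
print(cos(counts[idx['cat']], counts[idx['dog']]) >
      cos(counts[idx['cat']], counts[idx['stock']]))  # True
```

Prediction-based methods learn dense, low-dimensional vectors instead of raw counts, but the underlying signal is the same: shared contexts produce similar representations.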
Given two word embeddings represented as vectors, what is the output of the cosine similarity calculation?
import numpy as np

vec1 = np.array([1, 2, 3])
vec2 = np.array([2, 4, 6])

cos_sim = np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))
print('{:.2f}'.format(cos_sim))
Consider if one vector is a scaled version of the other.
Cosine similarity measures the cosine of the angle between two vectors, ignoring their magnitudes. Here vec2 is exactly 2 * vec1, so the vectors point in the same direction and the output is 1.00.
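To make the angle-only behavior concrete, a small helper (the cosine_similarity name here is our own, not from any library) covers the scaled, opposite, and orthogonal cases:

```python
import numpy as np

def cosine_similarity(a, b):
    # Dot product divided by the product of magnitudes:
    # this depends only on the angle between the vectors, not their lengths.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

v = np.array([1.0, 2.0, 3.0])

print(round(cosine_similarity(v, 5 * v), 2))   # 1.0: scaled copy, same direction
print(round(cosine_similarity(v, -v), 2))      # -1.0: opposite direction
print(round(cosine_similarity(np.array([1.0, 0.0]),
                              np.array([0.0, 1.0])), 2))  # 0.0: orthogonal
```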
You want to train word embeddings that capture rich semantic meaning. Which embedding size is most likely to work best?
Think about balancing detail and overfitting.
Embedding sizes of around 50 dimensions are a common choice: large enough to capture semantic nuance, yet small enough to avoid overfitting and excessive computation.
Which metric is best suited to evaluate if embeddings capture semantic similarity between words?
Think about comparing embedding similarity to human judgments.
Correlating cosine similarities between embedding pairs with human similarity ratings (for example, Spearman correlation on a benchmark such as WordSim-353) measures how well the embeddings reflect human semantic judgments.
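A sketch of this evaluation, with invented ratings standing in for a real benchmark; Spearman correlation is computed by hand as the Pearson correlation of the ranks:

```python
import numpy as np

# Hypothetical data: human similarity ratings for five word pairs and the
# cosine similarities a model assigns to the same pairs. All numbers here
# are made up for illustration.
human_scores = np.array([9.0, 8.5, 3.0, 1.5, 7.0])
model_cosines = np.array([0.82, 0.75, 0.30, 0.10, 0.66])

def ranks(x):
    # Convert scores to ranks (no tie handling needed for this toy data).
    order = np.argsort(x)
    r = np.empty_like(order)
    r[order] = np.arange(len(x))
    return r

# Spearman correlation = Pearson correlation of the ranks.
spearman = np.corrcoef(ranks(human_scores), ranks(model_cosines))[0, 1]
print(round(spearman, 2))  # 1.0: the model orders the pairs exactly as humans do
```

A correlation near 1 means the embeddings rank word pairs the same way humans do, even if the raw score scales differ.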
Consider this simplified embedding training snippet. Why do all word vectors end up identical?
import numpy as np

vocab = ['cat', 'dog', 'fish']
embeddings = {word: np.zeros(3) for word in vocab}

for epoch in range(3):
    for word in vocab:
        embeddings[word] += 0.1

print(embeddings)
Look at how the embeddings are initialized and updated.
All embeddings are initialized as zero vectors, and every vector receives the same update (+0.1 per component) on each iteration, so they remain identical throughout.
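One standard fix is to initialize the vectors with small random values so the words start out distinguishable; a minimal sketch (the use of NumPy's default_rng and the scale value are our choices, not from the original snippet):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ['cat', 'dog', 'fish']

# Small random initial values break the symmetry: each word starts with a
# distinct vector, so context-dependent updates can push them apart.
embeddings = {word: rng.normal(scale=0.1, size=3) for word in vocab}

# The vectors now differ from the start.
print(np.allclose(embeddings['cat'], embeddings['dog']))  # False
```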