Embeddings are vectors that represent words or items. Why do embeddings place words with similar meanings close to each other in the vector space?
Think about how words used in similar sentences might relate.
Embeddings are learned by training models to predict a word from its context, or the context from a word. Words that appear in similar contexts therefore end up with nearby vectors, which captures semantic similarity.
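A minimal way to see this effect without training a neural model is to build crude embeddings from raw co-occurrence counts on a toy corpus. The corpus and vocabulary below are invented for illustration; this is a simplified stand-in for context-prediction training, not the real algorithm:

```python
import numpy as np

# Toy corpus: "cat" and "dog" appear in similar contexts, "car" does not.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the car drove on the road",
]

# Build a vocabulary and a word-by-word co-occurrence matrix
# (context window = the whole sentence, for simplicity).
vocab = sorted({w for line in corpus for w in line.split()})
index = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))
for line in corpus:
    words = line.split()
    for w in words:
        for c in words:
            if w != c:
                counts[index[w], index[c]] += 1

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Rows of the count matrix act as crude embeddings: words sharing contexts
# ("cat" and "dog") come out closer than words that do not ("cat" and "car").
print(cosine(counts[index["cat"]], counts[index["dog"]]))
print(cosine(counts[index["cat"]], counts[index["car"]]))
```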
Given two word embeddings as vectors, what is the output of the cosine similarity calculation?
```python
import numpy as np

def cosine_similarity(vec1, vec2):
    # Cosine similarity = dot product divided by the product of the norms
    dot_product = np.dot(vec1, vec2)
    norm1 = np.linalg.norm(vec1)
    norm2 = np.linalg.norm(vec2)
    return dot_product / (norm1 * norm2)

embedding_a = np.array([1, 2, 3])
embedding_b = np.array([2, 4, 6])  # exactly 2 * embedding_a
result = cosine_similarity(embedding_a, embedding_b)
print(round(result, 2))  # prints 1.0
```
Consider the angle between vectors that are multiples of each other.
Cosine similarity measures the cosine of the angle between two vectors. Here embedding_b is exactly 2 * embedding_a, so the two vectors point in the same direction: the angle between them is 0 degrees and the cosine similarity is 1, so the code prints 1.0.
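For contrast, the same formula at the other extremes: orthogonal vectors give 0, and a negative multiple (opposite direction) gives -1. The vectors below are toy examples, and the helper is redefined so the snippet is self-contained:

```python
import numpy as np

def cosine_similarity(vec1, vec2):
    return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# Orthogonal vectors: angle 90 degrees, cosine 0.
print(cosine_similarity(np.array([1, 0]), np.array([0, 1])))

# Opposite direction (a negative multiple): angle 180 degrees, cosine -1
# (up to floating-point rounding).
print(cosine_similarity(np.array([1, 2]), np.array([-2, -4])))
```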
Which type of model is best suited to learn embeddings that capture semantic meaning of words?
Think about models that learn from context and sequence.
Language models learn to predict words based on context, which helps them learn embeddings that capture semantic relationships.
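The mechanism can be sketched with a heavily simplified skip-gram-style training loop in NumPy. Everything here (the five-word vocabulary, the training pairs, the dimension, the learning rate, the positive-pairs-only objective) is an arbitrary illustration of the idea, not a faithful word2vec implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Each word gets a "target" embedding (W_in) and a "context" embedding (W_out).
# Training nudges a word's target embedding toward the context embeddings of
# its neighbors, so words with shared neighbors end up with similar vectors.
vocab = ["the", "cat", "sat", "dog", "ran"]
index = {w: i for i, w in enumerate(vocab)}
dim = 4
W_in = rng.normal(scale=0.1, size=(len(vocab), dim))   # target embeddings
W_out = rng.normal(scale=0.1, size=(len(vocab), dim))  # context embeddings

def sgd_step(target, context, lr=0.1):
    """One positive-pair update with a sigmoid objective (no negative sampling)."""
    t, c = index[target], index[context]
    score = 1 / (1 + np.exp(-np.dot(W_in[t], W_out[c])))
    grad = score - 1  # derivative of -log(score) w.r.t. the dot product
    g_in = grad * W_out[c]
    g_out = grad * W_in[t]
    W_in[t] -= lr * g_in
    W_out[c] -= lr * g_out

# Train on (target, context) pairs in which "cat" and "dog" share contexts.
for _ in range(200):
    for pair in [("cat", "sat"), ("dog", "sat"), ("cat", "the"), ("dog", "the")]:
        sgd_step(*pair)

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# "cat" and "dog" were pulled toward the same context vectors, so their
# target embeddings end up highly similar.
print(cosine(W_in[index["cat"]], W_in[index["dog"]]))
```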
How does increasing the size of the embedding dimension affect the model's ability to capture semantic meaning?
Think about the trade-off between detail and data needed.
Larger embedding dimensions let the model represent more complex semantic relationships, but they also add parameters, so more training data is needed to avoid overfitting.
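The parameter cost of that trade-off is easy to quantify: an embedding table has vocab_size × dimension parameters, so parameter count grows linearly with the dimension. The vocabulary size below is an arbitrary example:

```python
# Illustrative parameter counts for an embedding table with a 50k vocabulary:
# doubling the embedding dimension doubles the number of parameters to fit.
vocab_size = 50_000
for dim in [64, 128, 256, 512]:
    params = vocab_size * dim
    print(f"dim={dim:4d}: {params:,} parameters")
```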
Which metric is most appropriate to evaluate how well embeddings capture semantic similarity between words?
Consider metrics that measure angle or direction rather than magnitude.
Cosine similarity measures the angle between vectors, which reflects semantic similarity regardless of vector length.
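To see why ignoring magnitude matters, compare cosine similarity with Euclidean distance on two vectors that point the same way but have very different lengths (toy numbers for illustration):

```python
import numpy as np

def cosine_similarity(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Same direction, very different magnitudes:
a = np.array([1.0, 2.0])
b = np.array([10.0, 20.0])

# Cosine similarity is ~1 (same direction), even though the Euclidean
# distance between the vectors is large.
print(cosine_similarity(a, b))
print(np.linalg.norm(a - b))
```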